Abstract


A type system is developed for the revised version of the VAL programming
language (VIMVAL) which has the following features: 


1. Type Inference: allows programs to be written with incomplete type
specifications. The type checker infers the types of the expressions from
their context.


2. Polymorphism: allows modules to be written which operate on more
than one type, performing analogous operations on different types of
data.


3. Higher order functions: functions are first class data in VIMVAL.


4. Recursive types: a type may refer to itself.


A theory of types is developed which applies to a large class of programming
languages, including VIMVAL. First the notion of type is defined, then the
interaction between types and programs is described, with a definition of type
correctness. Type correctness is shown to be well defined and decidable, and a type
checking algorithm is given which performs type checking for VIMVAL.
Chapter One


Introduction 


VAL (Value-Oriented Algorithmic Language), developed by Ackerman and Dennis
of M.I.T.'s Computation Structures Group [1], explored static data flow
architecture [5] for a side-effect free language. Side-effect free languages implement
functions which, when given a particular set of arguments, always return the same
result (as opposed to languages which allow side effects, where the result of calling a
function depends on the state of the environment as well as the explicit arguments).
Such languages are sometimes called "functional" because they implement
mathematical functions. Functional languages are well suited to highly parallel
computers because changing the order in which different parts of a program are run
(or running them in parallel) does not change the semantics of the program [7, 3].
The Computation Structures Group is now developing a new implementation of a
revised VAL, based on an abstract data flow machine called the VAL Interpretive
Machine (VIM), which executes data flow instructions directly. The revised version
of VAL is called VIMVAL. The original VAL does not support polymorphism,
recursive data types, recursive functions, higher order functions, or type inference.


Because it was not expected that the static architecture would implement proper
function application using data flow, VAL function calls are actually implemented
by compile time "macro expansion", precluding higher order functions in general,
and recursive functions in particular. VAL is a strongly typed language which
requires that the type of every variable and formal argument be completely and
explicitly specified. The VIM abstract machine includes mechanisms for function
application, and the Computation Structures Group is developing an
implementation of VIMVAL. Since higher order functions introduce extra
complexity, we decided to rework the type system for VIMVAL. Several desired
features for the type scheme of VIMVAL were proposed, most of which boiled down
to ease of use for the programmer. Ease of use has at least two components,
"writeability" and "readability": it is easier to write programs (at least it involves
fewer characters to write a program) in a language which requires a minimum of
symbols, while it is typically easier to read programs written in a language which
requires the programmer to add redundant information to a program. Thus "ease
of use" has different meanings for different people. Here is a set of criteria for
evaluating the ease of use of a type system.


- The type rules must be easy to remember and express: they should be
simple and consistent.


- The programmer should not be required to write a lot of extra symbols
just to facilitate type checking. "A lot" is subjective: some
programmers like to explicitly specify types, and some programmers
find that requiring such type specification hinders them.


- The language should be strongly typed, so that no type errors can occur 
at run time, and so that no type information needs to be represented at 
run time. 


To meet these goals, we have decided to incorporate type inference into VIMVAL. 
Type inference allows the programmer to write a program with a minimum of type 
declarations. Most types can be deduced from their context: for example, the type of
the constant 3.1415 must be REAL in VIMVAL, and multiplication of a REAL value
by some variable x would mean that x must also be REAL. The VIMVAL compiler
automatically determines the type of every expression, or gives an error saying that 
some expressions are ambiguously typed (i.e. expressions which have more than one 
possible type), or overconstrained (i.e. expressions which have no possible type). 
The type checking algorithm guarantees that no type errors will occur at run time. 


We adopt the strategy that the programmer should be required to write a minimum
number of extra symbols to facilitate type checking, while allowing a programmer to
optionally add extra type information to a program. We will discuss how well our
type inference system meets our goals in the conclusion of this paper.


VIMVAL has the following additional features which improve the expressive power
of the language, while adding some new difficulties to type inference that have not
been covered by [16, 15, 14].


Polymorphism 


Allows programmers to write functions which perform analogous
operations on different types of data. One example of a built in
polymorphic function is ARRAY_LIMH, which maps from any
array to an integer. Polymorphism and type inference are loosely
coupled in VIMVAL because we allow any type to be explicitly
written; thus we need a way to denote polymorphic types. The
main restriction on polymorphism is that a formal argument to a
function can not be used polymorphically; only free variables can
be used polymorphically.


Recursive data types 


Recursive types are allowed. In fact any type that can be written
is allowed. Recursive types are not the same as recursive data. It
is not possible to construct a recursive data object in VIMVAL
because VIMVAL requires that all data objects be "semantically"
constructed after their components are constructed. (There are
two "exceptions" to this rule. It is possible for a function to
operate on a copy of itself, but the circularity involved is very
stylized, and the functions are not actually being constructed with
self-references. VIMVAL has "early completion structures" [4],
which have certain advantages which do not affect the fact that
recursive data can not be built in VIMVAL.)


Higher order and recursive functions 


Functions are first class data in VIMVAL: functions can be
passed to and returned from functions, and functions can be used
as parts of structures. Recursive functions are a special case of
higher order functions. All recursive functions are defined to
have the same semantics as a program written with explicit
function arguments to replace recursion. (In a language with
higher order functions, explicitly specifying the type of a function
can be troublesome. See [10] for a discussion of this.)


Previous Work


Semantics of Types and Type Checking


Much work has been done recently on types. Scott [21] and McCracken [14] view
types as retracts of the universal domain (e.g. special functions on the set of all
objects which can be represented using strings of bits). Milner [16] views types as
ideals (which are special sets of objects meeting certain closure conditions).
Donahue [6] and Demers [2] claim that types are sets of operations (as opposed to
sets of objects). This approach is contrasted with the algebraic approach, where any
particular type is specified by its algebraic properties. We unify some of these
views, and following Solomon [22], we see types as sets of objects with certain
restrictions.


Type Inference, Polymorphism and Undecidability


Langmack [8] showed that two of VIMVAL's features, type inference and
polymorphism, can combine to make the type correctness of a program an
undecidable problem. Langmack showed that by either requiring all formal
arguments to be "monomorphic" (i.e. the arguments must have exactly one type), or
requiring all formal arguments to be explicitly typed, the undecidability can be
avoided. Our solution to this problem is to require all formal arguments to have
exactly one type, i.e. formals must be "monomorphic" [16] at run-time. This rules
out certain programs, but we believe, with the support of Milner [16], that most
useful programs have the property that all their formals are monomorphic anyway.


Type Inference Algorithms 


Solomon [22] implicitly described a type checking algorithm for certain kinds of
languages, where types can be described by regular sets, and the type declarations
are complete and explicit. This thesis will extend Solomon's work to embrace type
inference. (Also relevant is the work on type equivalence for types in Algol68 [20],
which uses finite state machines to perform comparisons of types, but we are not
directly concerned with such comparisons.)


Peacock [19] designed a type checking algorithm for VIMVAL based on constraint 
propagation through a graph representing a VIMVAL program. As Peacock pointed 
out, his algorithm was driven by side effects (which is not aesthetically pleasing to a 
group working on a purely applicative language such as VIMVAL), lacked a 
correctness proof, and was not implemented. This thesis corrects and extends
Peacock's work by presenting a type checking algorithm, proving it correct, and
supplying an implementation of the algorithm.


Overview


Our work involves type inference, and we argue that the sets of objects that are of a 
given type are in one to one correspondence with the sets of operations that define a 
type. We note an isomorphism between sets of restrictions and certain sets of 
objects: A given set of restrictions completely and uniquely describes a type, and a 
type completely and uniquely describes certain sets of objects. We go on to use that 
isomorphism between the restrictions and our intuitive understanding of types, to 
define types, because the restrictions are easy to formalize. The types then have
certain algebraic properties (those of regular sets) which are dependent on the
restrictions placed on them by a programming language.


We are interested in applying the algebraic properties of types directly to implement
the type checker, falling closer to Milner [16] and Scott [21], who are modeling type
checking, than to the algebraists, who are modeling the type objects.


Synopsis


Chapter 2 defines type in terms of regular sets and finite state automata: types are
regular sets with a certain decidable property. Chapter 3 describes the interaction
between types and programs, defining type assignments. Chapter 3 goes on to define
type-correctness in terms of the number of possible type assignments, and shows that
type-correctness is well defined and decidable, and that the type assignment for a
given program is computable. Chapter 4 describes the application of our type
checking system to VIMVAL. In conclusion we will examine the type system in
VIMVAL, and compare it with our ease-of-use goals.
Chapter Two


Types 


The goal of this chapter is to define the notion of type rigorously. We discuss our 
intuitive notions of types, and how well they fit some currently available 
programming languages. Then using examples from a dialect of LISP, we motivate 
several definitions, which lead to a definition of type-systems and types, which 
formalize our intuition. A type is a description of a set of objects which have a
certain property (the type of the objects). The description can be written as a regular
expression, thus types are isomorphic to regular sets.


2.1 A Discussion of Type Checking 


Types are easy to use, but difficult to describe. Intuitively, type checking is
something which can catch certain programming errors (type errors), such as adding
an integer to a string, or using an array as if it were a function. Many LISP
implementations provide run time type checking, which detects type errors when
they happen. This approach is not robust because it is difficult to determine when
all the type errors in a program have been removed. Another approach, which we
take, is that programs are checked statically for type correctness. In order to
perform such static type checking, we traditionally have to put up with a loss of
notational convenience: we may have to add extra symbols to a program to help the
type checker, or the extra restrictions required for static type checking might mean
that we are not able to express a program in the way we want to. Another
possibility is that the type checking system might not find all type errors (e.g. the lint
program on UNIX does some type checking on C programs, but it is not guaranteed
to find all type errors). It is difficult to "retrofit" a programming language with
static type checking because it is often impossible to perform complete static type
checking. (In LISP the property that cdr of nil is never taken can not be statically
checked, and in C it is not possible to statically check that a pointer value actually
points to a valid address.)


Our type theory will follow Leivant [9] and Solomon [22], who model types as
structural conditions on data objects: given a data object O, and a type T, it is
possible to decide whether O is of type T by examining the structure of O. This
approach means that types are sets of objects. In this case, T is a description of the
possible "shapes" of O. We specifically follow Solomon, and claim that T describes
a regular set of paths, where a path is a sequence of symbols in some alphabet
(called the selectors) which corresponds to a legal sequence of operations on object
O. This approach means that types are isomorphic to regular sets, and everything we
want to know about a type can be rephrased in terms of regular sets.


2.2 The Properties of Types 


Our goal is to define type rigorously. In order to do this we need to deal with some
of the restrictions that we intuitively associate with types (for example no object is
both an integer and a real, and arrays have a "subtype", but integers do not). First
we will describe selectors, then paths. Then we will discuss the restrictions, leading
to the definition of a type-system. Finally we will define type.


We will use LISP examples in this chapter, even though the types of LISP do not
necessarily match the types of VIMVAL. We use the words "path" and "selector"
informally to motivate our definitions, which appear below. The dialect of LISP
that our examples will use has two base types:


Integers The only selector for an integer is INT. There is only one path
from an integer, and that is <INT>.


Cons cells Cons cells have a CAR and a CDR, so the selectors for a cons cell 
are CAR or CDR. All paths from a cons cell start with CAR or 
CDR. Cons cells can be built with the CONS function. There is 
a special cons-cell called NIL, which has CAR and CDR both 
NIL. The LIST operator builds a list of cons cells in the standard 
way, ending with a NIL. For example: 


(LIST X Y Z) = def (CONS X (CONS Y (CONS Z NIL))) 


We will be a little sloppy with the type of NIL in our examples, 
because NIL is a “polymorphic” value (it could be an empty 
LIST of anything), and we have not developed the tools to 
discuss NIL’s type. 


Paths for LISP are sequences with elements in {INT, CAR, CDR}. This set is called
the set of selectors for LISP.


Notation: The set of selectors for a program, denoted $\Sigma$, is some finite set,
which is dependent on the program being type checked.


Elements of $\Sigma$ will be written in uppercase italics, e.g. INT and CAR.


Notation: A path is a sequence, with each element of the sequence in $\Sigma$.
Paths are possibly infinitely long.


The length of a path $x$ is denoted $|x|$.


If $x$ is a path with $|x| \geq i$, then $x_i$ is the $i$th element of $x$. The first element
of $x$ is $x_1$.


We write finite paths with angle brackets: $x$ = <INT, CAR> is a path with
$x_1$ = INT and $x_2$ = CAR. The symbol <> denotes the path of length zero
(the empty path).


Paths can be concatenated: if $x$ and $y$ are paths, then $z = x \circ y$ is a path,
where if $x$ is infinite then $z = x$, otherwise $z_i = x_i$ for $i \in \{1, \ldots, |x|\}$, and
$z_{|x|+i} = y_i$ for all finite $i \in \{1, \ldots, |y|\}$.




The words tuple and string are often used for things which are similar to paths, but
typically tuples and strings are finite in length.


Consider the LISP value, O, generated by

(CONS 1 2).

Here, O is a cons cell containing an integer in both its car and its cdr; the set of paths
for O is { <CAR, INT>, <CDR, INT> }, and this set defines the "type" of O (see
Figure 2-1).


Figure 2-1: (CONS 1 2) Cell, with paths: { <CAR, INT>, <CDR, INT> }
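
As an illustration of how a finite value determines its set of paths, the following
is a minimal Python sketch. The encoding is an assumption made only for this
example (it is not part of the thesis): a Python int stands for a LISP integer, a
2-tuple (car, cdr) stands for a cons cell, and NIL is not modelled.

```python
# Hypothetical encoding for illustration only:
#   Python int       -> LISP integer
#   2-tuple (a, d)   -> cons cell with car a and cdr d
# NIL is deliberately left out, as the text postpones its type.

def paths(obj):
    """Return the set of selector paths (as tuples) of a finite value."""
    if isinstance(obj, int):
        # An integer has exactly one path, <INT>.
        return {("INT",)}
    car, cdr = obj
    # Every path of a cons cell starts with CAR or CDR.
    return ({("CAR",) + p for p in paths(car)} |
            {("CDR",) + p for p in paths(cdr)})


cons_1_2 = (1, 2)  # corresponds to (CONS 1 2)
print(sorted(paths(cons_1_2)))
```

For (CONS 1 2) this yields the two paths <CAR, INT> and <CDR, INT> given in the text.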


The previous example describes a type which is a finite set of finite paths. The next
example illustrates a type which is an infinite set of infinitely long paths. Consider
the type equation T = CONS[T,T]. The paths for this type are infinitely long, and
consist of any sequence of CARs and CDRs.


For the next example (which will give an example of a type which is an infinite set,
all the elements of which are finite except for one) we need a few standard
definitions, adapted from [11]. We will also need the following definitions to define
type.
Definition 2-1: If A and B are sets of paths, then the composition of A and
B is

$A \circ B =_{def} \{ \sigma \circ \rho \mid \sigma \in A, \rho \in B \}$,
where $\sigma \circ \rho$ is the concatenation of path $\sigma$ and path $\rho$.


The definition of concatenation of paths automatically takes care of the case where
some of the elements of A or B are infinite.


We want to compose $i$ copies of a set of paths, A, where $i$ can be a finite integer, or it
can be $\infty$. The case of a finite integer is adapted directly from [11], while we need
an extra definition to define the case of $i$ infinite.


Definition 2-2: If A is a set of paths, and $i$ is a finite integer, then $A^i$ is
defined recursively:


- $A^0 =_{def} \{ <> \}$ (i.e. the set containing the empty path, not $\emptyset$)


- $A^i$ (for $i > 0$) $=_{def} A \circ A^{i-1}$


Definition 2-3: A path $\sigma$ is an initial segment of a path $\gamma$ if there is some
path $\rho$, such that $\sigma \circ \rho = \gamma$.


Definition 2-4: If A is a set of paths, then
$A^\infty =_{def} \{ \sigma \mid \forall k, \exists \rho \in A^k$ such that $\rho$ is an initial segment of $\sigma \}$.


If A is a set of paths, then $A^i$ is the set of paths which are made by concatenating $i$
elements of A together. $A^\infty$ is the set of paths which are made by concatenating an
infinite number of elements of A together.
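
The composition of two sets of paths and the finite powers of a set can be sketched in
Python as follows. This is only an illustration under stated assumptions: paths are
modelled as Python tuples, only finite sets of finite paths are handled (the infinite
case is not representable this way), and the function names are hypothetical.

```python
def compose(A, B):
    """Composition of two finite sets of finite paths:
    every path of A followed by every path of B."""
    return {s + r for s in A for r in B}


def power(A, i):
    """The i-th power of A for a finite non-negative integer i."""
    result = {()}              # zeroth power: the set holding the empty path
    for _ in range(i):
        result = compose(A, result)
    return result


A = {("CAR",), ("CDR",)}
print(sorted(power(A, 2)))     # all sequences of two CARs/CDRs
```

Note that the zeroth power is the set containing the empty tuple, not the empty set;
composing with the empty set would always give the empty set.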


Definition 2-5: The Kleene star operator on sets, written $^*$, denotes the
operation

$A^* =_{def} \bigcup_{i \in \{0, 1, \ldots, \infty\}} A^i$

Intuitively, $A^*$ is the set of all paths which are concatenations of zero or more
elements of A. Note that we allow an infinite concatenation of elements of A.


Definition 2-6: The Kleene plus operator on sets, written $^+$, is

$A^+ =_{def} \bigcup_{i \in \{1, \ldots, \infty\}} A^i$


Note that $A^* = \{ <> \} \cup A^+$.


Now we have the tools to examine an interesting type in our LISP dialect. The type
LIST[U] is useful in LISP, and our type system can express the semantics of this
type.


Given a cons cell O of type T with car of type U (where U is the set of legal paths for
an object of type U), and cdr of the same type as O (i.e. any operation legal on O is
also legal on cdr(O), making T a recursive type), we have

$T = \{ <CDR> \}^* \circ \{ <CAR> \} \circ U$.

An object of the type shown in Figure 2-2 might be generated by (LIST 1 2 3),
where U is { <INT> } in this case. Note that T is a regular set, and can thus be
accepted by a finite state automaton if U is a regular set.


Note also that one of the elements of T is the infinite path $x$, such that $x_i$ = CDR
for all positive integers $i$.
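
To make the shape of this set concrete, here is a small Python sketch that enumerates
a finite slice of T: the paths that use at most a given number of leading CDRs. The
function name and the bound are assumptions for illustration; the full set T is
infinite and also contains the infinite all-CDR path, which this sketch cannot
represent.

```python
def list_type_paths(U, max_cdrs):
    """Finite paths of the LIST type over element paths U that begin
    with at most max_cdrs CDR selectors, then one CAR, then a path of U."""
    return {("CDR",) * k + ("CAR",) + u
            for k in range(max_cdrs + 1)
            for u in U}


U_int = {("INT",)}  # element type: integer
for p in sorted(list_type_paths(U_int, 2), key=len):
    print(p)
```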


The examples we have presented have types which can be represented by regular
sets. Solomon [22, 23] showed that the only types we should consider are the ones
which can be represented by regular sets. We place an additional restriction (the
details of which are dependent on the programming language that the type system is
being implemented for): some regular sets are illegal as types. In our dialect of
LISP, for example, the set { <CAR>, <INT> } is illegal, because there is nothing
which has a CAR and is an integer. Thus for a given programming language there
are selector classes which provide the information to check for illegal sets like
{ <CAR>, <INT> }.


Figure 2-2: Recursive type, with an object of the type T = CONS(INT,T)
(also known as LIST[INT]), along with the FSA which accepts T.


We require that the selector classes for a given programming language partition $\Sigma$
into equivalence classes.


In VIMVAL, each equivalence class in the selector classes represents a different
"type class" or "type generator" (such as ARRAY, RECORD or INT). This
method of partitioning $\Sigma$ would allow us to generalize our type system to include
abstraction, and this possibility is discussed briefly in the conclusion. It is not
essential to our work on the type checking algorithm that the selector classes are
formed according to the rule that each class corresponds to a "type generator".
The selector classes for our LISP dialect are


- { CAR, CDR }


- { INT }


Some selector symbols can not be followed by any other selectors. Our LISP dialect
does not allow paths of the form <INT, CAR, ...>, because that would imply that
there is some object which is an integer, which has a CAR. (It is not clear what such
a path would mean, but we do not want it.) Thus, for a given programming
language, some elements of $\Sigma$ can only appear as the last element of a finite path.


Notation: The set of terminators, a subset of $\Sigma$, is the set, defined by the
programming language, such that any path having a terminator in a non-final
position is illegal.


In our LISP dialect, { INT } is the set of terminators.


In VIMVAL and our LISP dialect, the terminators correspond to "scalar" types, or
"base" types. We do not, however, require that such a correspondence hold for our
type checking algorithm to work.


A few extra definitions are needed to define types. We want to be able to talk about
the "first part" of a set of paths, and the "last part" of a set of paths, so that types can
be described in terms of these properties.


Definition 2-7: The head of $U \subseteq \Sigma^*$ is the set of first elements of the
paths in U.
$head(U) =_{def} \bigcup_{\gamma \in U} \{ \gamma_1 \}$

Definition 2-8: The rest of a path $\sigma \in \Sigma^*$ is $\sigma$ with the first element
removed.
$rest(\sigma) =_{def} \rho$ such that $<\sigma_1> \circ \rho = \sigma$

Definition 2-9: The tail of $U \subseteq \Sigma^*$ is U with the first element of every
path removed.
$tail(U) =_{def} \{ \sigma \mid \exists \rho \in U$ where $\sigma = rest(\rho) \}$

Definition 2-10: If $X \in \Sigma$ then the X-selected tail of $U \subseteq \Sigma^*$ is
$tail_X(U) =_{def} \{ rest(\gamma) \mid \gamma \in U$ and $\gamma_1 = X \}$.


Now we can encapsulate all the type information for a given programming language
into a type system. Type systems are dependent on their programming language: the
correctness of a type system depends on the semantics of the programming language
associated with it. We often refer to a type system as a programming language in this
paper, because of this dependence.


Definition 2-11: A type system L is a three-tuple $<\Sigma, C, \mathcal{T}>$ where


- $\Sigma$ is the set of selectors in L,


- $C$ is the set of selector classes, which partitions $\Sigma$,


- and $\mathcal{T}$ is the set of terminators, $\mathcal{T} \subseteq \Sigma$.


In order to define type, we need to be able to talk about certain properties of regular
sets which are easily defined recursively. One such property is that for any selector
$\sigma$, the $\sigma$-selected-tail of a type, T, must also be a type (or be empty). This recursion
could be a real problem: e.g. for the type LIST[U], the CDR-selected-tail of the type
LIST[U] is LIST[U]. There is no obvious way to terminate the recursion. By
constructing a finite state automaton (FSA) which accepts the regular set, we can
perform the tests we are interested in without resorting to such infinite recursion.
The following definitions, which describe properties of FSA, were adapted from
[11].


Definition 2-12: An FSA is a tuple $(K, \Sigma, \delta, s, F, \mathcal{R})$ where


- $K$ is a finite set of states,


- $\Sigma$ is an input alphabet,


- $\delta$ is a function mapping some subset of $K \times \Sigma$ into $K$,


- $s$ is a start state ($s \in K$),


- $F$ is a set of accepting states ($F \subseteq K$),


- and $\mathcal{R}$ is a reject state ($\mathcal{R} \in K$),


and $\delta(\mathcal{R}, \sigma)$ is undefined for all $\sigma \in \Sigma$.

Definition 2-13: A configuration of an FSA is a pair $(k, \sigma)$ with $k \in K$ and
$\sigma \in \Sigma^*$.

Definition 2-14: A binary relation $\vdash_M$ holds between configurations of
M, an FSA. $(k, \sigma) \vdash_M (k', \sigma') \iff \delta(k, \sigma_1) = k'$ and $rest(\sigma) = \sigma'$, in which
case we say that $(k, \sigma)$ yields $(k', \sigma')$ in one step. We denote the reflexive
transitive closure of $\vdash_M$ as $\vdash^*_M$. If $\delta(k, \sigma_1)$ is undefined, then
$(k, \sigma) \vdash_M (\mathcal{R}, \sigma)$. ($\sigma_1$ is the first element in the path $\sigma$.)

Definition 2-15: An FSA, M, accepts a path $\sigma$ if the following hold:


- If $\sigma$ is finite then $(s, \sigma) \vdash^*_M (f, <>)$ for some $f \in F$.


- If $\sigma$ is infinite then it is not true that $(s, \sigma) \vdash^*_M (\mathcal{R}, \sigma')$ for any path
$\sigma'$.


Note that if an FSA reaches a configuration $(k, \sigma)$, where $\delta(k, \sigma_1)$ is undefined, then the
FSA "hangs", and never accepts its input. Specifically, if an FSA reaches the state $\mathcal{R}$,
then the input is not accepted.
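
The finite case of acceptance can be sketched in Python with a partial transition
function stored as a dictionary. This is a minimal sketch under assumptions: the
state names, the REJECT marker, and the three-state automaton for LIST[INT] are
invented for illustration, and infinite paths are not handled.

```python
REJECT = "reject"  # stands in for the distinguished reject state


def accepts(delta, start, accepting, path):
    """Run a partial transition function over a finite path.

    delta maps (state, selector) pairs to states; a missing entry means
    the transition is undefined, which sends the automaton to REJECT.
    """
    state = start
    for symbol in path:
        state = delta.get((state, symbol), REJECT)
        if state == REJECT:
            return False       # once rejected, the input is never accepted
    return state in accepting  # the whole path was consumed


# Hypothetical FSA for LIST[INT]: any number of CDRs, one CAR, then INT.
LIST_INT = {
    ("list", "CDR"): "list",
    ("list", "CAR"): "elem",
    ("elem", "INT"): "done",
}

print(accepts(LIST_INT, "list", {"done"}, ("CDR", "CDR", "CAR", "INT")))
```

Because no transition is ever defined out of REJECT, returning early when that state
is reached gives the same answer as continuing to read the input.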


The set of paths accepted by an FSA is a regular set of paths, and is called the set
that the FSA accepts.


We now have everything needed to define type.


Definition 2-16: T, a regular set of paths, is a type in a programming
language $<\Sigma, C, \mathcal{T}>$ if there is some FSA, $M = (K, \Sigma, \delta, s, F, \mathcal{R})$,
accepting T such that


- M rejects <>.


- Given a state $k$, if $H_k = \{ \sigma \in \Sigma \mid \delta(k, \sigma)$ is defined $\}$, then $H_k$ is a
subset of some selector class in $C$.


- For every state $k \in K$, and every symbol $X \in \Sigma$, if $\delta(k, X) \in F$, then
$X \in \mathcal{T}$. (Terminators occur only at the end of finite paths.)


It is not necessary to force M to be unique in the definition of type, because if T is a
type, and N is an FSA which accepts T, then N meets the conditions imposed on M
in the definition of type. We leave this assertion without proof.
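
The three conditions in the definition of type can be checked mechanically on one
given FSA. The Python sketch below is an illustration under assumptions: the FSA is
given as a dictionary of defined transitions, the function and state names are
hypothetical, and the function tests a single automaton rather than searching for
some automaton that accepts the set.

```python
def is_type_fsa(delta, start, accepting, selector_classes, terminators):
    """Check the three conditions of the definition of type on one FSA."""
    # Condition 1: the empty path is rejected, i.e. the start state
    # is not an accepting state.
    if start in accepting:
        return False

    # Condition 2: the selectors defined at any one state all belong
    # to a single selector class.
    defined = {}
    for (state, symbol) in delta:
        defined.setdefault(state, set()).add(symbol)
    for symbols in defined.values():
        if not any(symbols <= cls for cls in selector_classes):
            return False

    # Condition 3: every transition into an accepting state is
    # labelled by a terminator.
    return all(symbol in terminators
               for (state, symbol), target in delta.items()
               if target in accepting)


LISP_CLASSES = [{"CAR", "CDR"}, {"INT"}]
LISP_TERMINATORS = {"INT"}

# Hypothetical FSA for LIST[INT].
list_int = {("list", "CDR"): "list", ("list", "CAR"): "elem", ("elem", "INT"): "done"}
print(is_type_fsa(list_int, "list", {"done"}, LISP_CLASSES, LISP_TERMINATORS))

# FSA for the illegal set { <CAR>, <INT> }: mixes two selector classes.
illegal = {("s", "CAR"): "done", ("s", "INT"): "done"}
print(is_type_fsa(illegal, "s", {"done"}, LISP_CLASSES, LISP_TERMINATORS))
```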
Chapter Three


Type Checking 


Now that we have defined types, we can define type-checking by specifying the
interactions between types and programs. A program has a set of nodes¹ that we
want to type (to type node N is to assign a type to N), and some information about
the types of the nodes (which we call operators). We first lay some groundwork,
defining concepts such as program and type-assignments, and then define
type-correctness in terms of the number of possible type-assignments for a program.
We conclude this chapter by showing that type-correctness is well defined and
decidable.


3.1 Type Assignments and Programs 


Our type checking algorithm will try to infer the type of every node in a program
from its "context". We need to specify what we mean by the "context" of a node,
and to do that we will define three kinds of "operators" on nodes: parameterized
restrictions, containers, and closures.


Notation: The set of node names is denoted $\mathcal{N}$. $\mathcal{N}$ must be disjoint from
$\Sigma$.


$\mathcal{N}$ might be infinite, but any given program will only use a finite subset of $\mathcal{N}$.


A type assignment gives us a way to associate a type with a node in a program. 


¹ Nodes are roughly equivalent to expressions, except that there may be some expressions that we
will not want to type (for example expressions in a module which is never used), and there are some
things that we might want to type which are not expressions (for example type declarations). See [19]
for a more complete discussion of nodes.
Definition 3-1: A type assignment R is a regular subset of $\mathcal{N} \circ \Sigma^*$ such that
$\forall x \in head(R)$, $tail_x(R)$ is a type.

Notation: The set of all type assignments is denoted $SOTA_{all}$. Subsets of
$SOTA_{all}$ are elements of the power set of $SOTA_{all}$, written $\mathcal{P}(SOTA_{all})$.


There is an interesting isomorphism between type assignments and mappings from
P.NodeNames to types. Given a program P, and a type assignment T, there is a
mapping U from $\mathcal{N}$ to types such that $U(n) = tail_n(T)$. Conversely, given a mapping U, there is
a type assignment T, such that $tail_n(T) = U(n)$. We named type assignments "type
assignments" because they are isomorphic to mappings which assign a type to every
node, and we will freely, without warning, use this isomorphism when it is
convenient.


We are interested in finding which type assignments are consistent with the 
"context" in which each node appears in a program. 
Definition 3-2: Given an alphabet $\Delta$, if $\sigma \in \Delta^*$, $\sigma$ finite, and R is a regular
set over $\Delta$, then the regular set after $\sigma$ in R is
$after_\sigma(R) =_{def} \{ \sigma' \mid \sigma \circ \sigma' \in R \}$.
Note that for a symbol $x \in \Delta$, $tail_x(R) = after_{<x>}(R)$.
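
For a finite set of finite paths, the "after" operation is a prefix filter. The Python
sketch below illustrates it on a small hand-written type assignment; the node names
"1" and "2", the LIST selector, and the function name are assumptions for this
example only.

```python
def after(sigma, R):
    """Paths that can follow the finite path sigma inside the set R."""
    n = len(sigma)
    return {p[n:] for p in R if p[:n] == sigma}


# A hypothetical finite type assignment: node "1" is an integer and
# node "2" has a LIST selector leading to an integer.
R = {("1", "INT"), ("2", "LIST", "INT")}

print(after(("1",), R))         # what follows node "1"
print(after(("2", "LIST"), R))  # what follows <2, LIST>
```

With a one-symbol path this coincides with the selected tail: after(("1",), R) is the
set of rests of all paths in R that begin with "1".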


A parameterized restriction gives us the ability to say that two nodes are the same
type. First we can specify the two nodes $n$ and $n'$ that we are interested in by giving
two paths, $\sigma$ and $\sigma'$ respectively. Any FSA which represents a type assignment
which is consistent with a given parameterized restriction has the property that if we
start from the start state of the FSA and $\sigma$ and $\sigma'$ lead to states $k$ and $k'$ respectively,
then the languages accepted by starting at $k$ and $k'$ must be the same. This is
equivalent to saying that there must be some FSA accepting the same language such
that $k = k'$. We formalize this with the definition of state-equivalent, and then define
parameterized restriction.
Definition 3-3: Given a regular set R, two paths $\sigma$ and $\sigma'$ are
state-equivalent if $after_\sigma(R) = after_{\sigma'}(R)$, in which case we write $\sigma \equiv_R \sigma'$.

Definition 3-4: Given a set of regular sets $\mathcal{A}$ (with every regular set in $\mathcal{A}$
over a fixed alphabet $\Delta$), for every pair $(\sigma, \sigma') \in \Delta^* \times \Delta^*$, with $\sigma$ and $\sigma'$
finite, there is a set of regular sets $\{ R \mid R \in \mathcal{A}$ and $\sigma \equiv_R \sigma' \}$. We call this
set the parameterized restriction of $(\sigma, \sigma')$, and write the set as $\mathcal{A}_{\sigma \equiv \sigma'}$.


A container gives us the ability to say that the type assignment for our program has a
given path $\sigma$ in it.

Definition 3-5: Given a set of regular sets $\mathcal{A}$ (with every regular set in $\mathcal{A}$
over a fixed alphabet $\Delta$), for every $\sigma \in \Delta^*$ there is a set of regular sets
$\{ R \mid R \in \mathcal{A}$ and $\sigma \in R \}$. We call this set the container of $\sigma$ in $\mathcal{A}$.


A closure gives us the ability to say that a given node must have selectors which are a
subset of some finite set of selectors. We choose the node by giving a path, and
specify the set by listing it.

Definition 3-6: Given a set of regular sets $\mathcal{A}$ (with every regular set in $\mathcal{A}$
over a fixed alphabet $\Delta$), a finite set of symbols $\Psi \subseteq \Delta$, and a finite path
$\sigma \in \Delta^*$, there is a set of regular sets
$\{ R \mid R \in \mathcal{A}$, and $head(after_\sigma(R)) \subseteq \Psi \}$. We call this set the closure under
$\Psi$ of R selected by $\sigma$.
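
Membership in a container or in a closure can be tested directly when a candidate
type assignment is a finite set of finite paths. The following Python sketch shows
both tests; the function names, the node name "1", and the example set are
assumptions for illustration and do not come from the thesis.

```python
def after(sigma, R):
    """Paths that can follow the finite path sigma inside the set R."""
    n = len(sigma)
    return {p[n:] for p in R if p[:n] == sigma}


def in_container(R, sigma):
    """True if R belongs to the container of sigma, i.e. sigma is a path of R."""
    return sigma in R


def in_closure(R, sigma, allowed):
    """True if every selector that can follow sigma in R is in `allowed`."""
    heads = {p[0] for p in after(sigma, R) if p}
    return heads <= allowed


# Node "1" is a cons cell holding two integers.
R = {("1", "CAR", "INT"), ("1", "CDR", "INT")}

print(in_container(R, ("1", "CAR", "INT")))
print(in_closure(R, ("1",), {"CAR", "CDR"}))
print(in_closure(R, ("1",), {"INT"}))
```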


Now that we have defined the kinds of restrictions we want to make on type
assignments, we define an operator to be one of those restrictions. A program will
actually consist of some nodes and some operators.

Definition 3-7: An operator OP is a subset of $SOTA_{all}$ which is either a
parameterized restriction, a container, or a closure of $SOTA_{all}$.


Notation: If OP is an operator, then the operands of OP are the node
names mentioned in OP.


The meaning of an operator is that if there is some restriction on the types of some 
nodes in a program, the operator contains the information describing the restriction. 
For example, if, given a program, we have an operator which requires (informally) 
that "if the type of node 1 is T, then the type of node 2 is LIST[T]" then the operator 
is $\{ R \mid R \in \mathrm{SOTA}_{\{1,2\}} \text{ and } \langle 1 \rangle \equiv_R \langle 2, \mathrm{LIST} \rangle \}$. A more concise way of writing this set 
is $(\mathrm{SOTA}_{\{1,2\}})_{\langle 1 \rangle \equiv \langle 2, \mathrm{LIST} \rangle}$. 

Definition 3-8: A program P is an ordered pair (NodeNames, ops), 

- where NodeNames is a set of node names (a finite subset of $\mathcal{N}$), 
referred to as P.NodeNames, 

- and ops is a finite set of operators, where each operator's operands 
are a subset of the names of the nodes in the program (i.e. 
$\forall x \in \mathrm{ops},\ \mathrm{head}(x) \subseteq$ P.NodeNames). This set is referred to as P.ops. 

Notation: The set of all programs is referred to as $\Pi$. 


By taking all of the operators in a program, and combining their information, we can 
deduce the type assignment for a program. 


Notation: The intersection of all the operators in a program is called the 
complete-restriction of the program. 


Definition 3-9: $\mathcal{T} : \Pi \to \mathcal{P}(\mathrm{SOTA}_N)$ is a function mapping programs 
into sets of type assignments. Given a program P, $\mathcal{T}(P)$ is defined by 

$\mathcal{T}(P) = \bigcap_{x \in P.\mathrm{ops}} x$ 


3.2 Type Correctness - "There is a solution" 


Definition 3-10: A program P is type correct if $|\mathcal{T}(P)| = 1$. 

Definition 3-11: A program P is type ambiguous if $|\mathcal{T}(P)| > 1$. 

Definition 3-12: A program P is type overconstrained if $|\mathcal{T}(P)| = 0$. 


Theorem 3-13: Type correctness is well defined, and is independent of the 
order in which the restrictions are examined for a given program. 


Proof: Set intersection is associative and commutative. § 
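
The three cases of definitions 3-10 to 3-12, and the order independence claimed by the theorem, can be illustrated with a small sketch. This is only an illustration under simplifying assumptions: each operator is modeled as a finite `frozenset` of opaque type-assignment labels (real operators are infinite sets of regular sets), and the names `complete_restriction`, `classify` and `EXAMPLE_OPS` are hypothetical.

```python
from functools import reduce
from itertools import permutations

def complete_restriction(operators):
    """Intersect all operators of a program (assumes at least one operator)."""
    return reduce(frozenset.intersection, operators)

def classify(operators):
    """Classify a program by the size of its complete-restriction."""
    size = len(complete_restriction(operators))
    if size == 0:
        return "type overconstrained"
    if size == 1:
        return "type correct"
    return "type ambiguous"

# Hypothetical operators over the labels t1..t4; only t2 survives them all.
EXAMPLE_OPS = [
    frozenset({"t1", "t2", "t3"}),
    frozenset({"t2", "t3"}),
    frozenset({"t2", "t4"}),
]

# Every order of examining the restrictions gives the same verdict.
verdicts = {classify(list(order)) for order in permutations(EXAMPLE_OPS)}
```

Because intersection is associative and commutative, `verdicts` contains a single entry no matter how many orders are tried.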

Peacock's proposed implementation of type checking for VIMVAL [19] used a graph, 
through which information about the restrictions of the operators of a program was 
propagated. Peacock's thesis posed the question: "Can changing the order in which 
constraints are propagated through the graph change the final answer?". We can 
answer "no" to this question because if $\alpha$ and $\beta$ are such that given a regular set $R$, 
$R_\alpha$ and $R_\beta$ are operators, then it is true that: 

$(R_\alpha)_\beta = (R_\beta)_\alpha$ 


We accept without proof the following: 


Proposition 3-14: If a program is type correct, then no “type errors” (in the 
intuitive sense) will occur while running the program. 


This is difficult to prove, because it is dependent on the semantics of the language 
the program is written in. Even if the language’s type system conforms to our 
model, the correctness of type correctness depends on how accurately the set of 
operators for the language is described. Given a careful semantic model for a 
programming language, and a set of operators which are consistent with the model, 
a proof of this proposition would involve showing that if the local constraints 
imposed by the operators are true then no type errors will occur at run-time. 
Milner [16] proves this proposition for the language he considers. We will leave this 
proposition unproven for VIMVAL. 


3.3 An Algorithm for Determining Type Assignments 


Theorem 3-13 shows that we can talk about type correctness for incompletely typed 
programs with recursive types, and gives a definition of type correctness, but it does 
not give us an algorithm for determining those types. In this section we will prove 
that there is an algorithm for computing the type assignment for a given program. 


If $x$ is the intersection of a finite collection of operators, we need to show that it is 
possible to compute whether $|x| = 0$, $|x| = 1$ or $|x| > 1$. If $|x| = 1$, i.e. $x = \{ y \}$ for 
some type assignment $y$, then we need to show that we can actually compute $y$. 
Specifically, we need to be able to build a FSA which accepts $y$, so that the VIMVAL 
compiler can use the type information to compile a program. (Other representations 
of regular sets would be equivalent to building a FSA which accepts $y$ [11].) 

Theorem 3-15: Given a program P, the type correctness of P is decidable. 
If P is type correct, then the type assignment is computable. 


Proof: Suppose P has operators equal to the union of some containers 
described by the set of paths $\{ x_i \mid i = 1, \ldots, l \}$, and some parameterized 
restrictions described by the set of pairs of paths $\{ (y_i, z_i) \mid i = 1, \ldots, m \}$, and 
some closures $\{ (\Psi_i, w_i) \in \mathcal{P}(\Delta) \times \Delta^* \mid i = 1, \ldots, n \}$, where $\Delta = \mathcal{N} \cup \Sigma$. 


We need to determine how many type assignments (which are regular 
sets) there are that are elements of every operator in P. Since type 
assignments are regular sets, we can consider the FSA's which 
accept the type assignments. In general, there will be more than one FSA 
which accepts a given type assignment, but we can consider, without loss 
of generality, the set of FSA's with no more than $p$ states, where 

$p = |P.\mathrm{NodeNames}| + \sum_{i=1}^{l} |x_i| + \sum_{i=1}^{m} (|y_i| + |z_i|) + \sum_{i=1}^{n} |w_i| + 3.$ 

The reason we can make this reduction is that the set of FSA's which accept 
the languages described by any operator all have a bounded number of 
states, thus the set of FSA's which accept languages in the complete 
restriction of a program also have a bounded number of states. Our 
bound is correct because if there are two languages meeting the 
restrictions of the operators of the program, then there are two which need at 
most $p$ states: it is possible that every time a node or symbol is mentioned 
by an operator, another state will be needed, plus we add one for the 
rejecting state, one for an accepting state, and one for an "unconstrained" 
state which can be used to make type assignments different for two FSA's 
(assuming the unconstrained state is reachable from the start state). 

Since it is decidable whether the language accepted by a given FSA is in a 
given operator, we simply need to generate a list of all FSA's with no more 
than $p$ states, filter out the ones which do not accept a type assignment, 
and determine which of them are members of every operator in P. Given 
this new set of FSA's which are in every operator of P, we need to 
determine whether they all accept the same language, which is decidable. 
If they do, then the program is type correct. If they do not, then the 
program is type ambiguous. Of course, if there is no FSA which accepts a 
language which is in every operator of P, then P is type overconstrained. 


If a program P is type correct, then the type assignment is the language 
accepted by one of the FSA's that is found by the algorithm described above. 

It is not really satisfying to be forced to use an algorithm as inefficient as the 
algorithm described above for determining type correctness. This algorithm is 
exponential in the size of the input program since the number of FSA's of size $p$ 
is exponential in $p$. 
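
To give a feeling for how quickly the search space grows, the sketch below counts candidate automata under one particular set of assumptions that are ours rather than the text's: the automata are deterministic and complete, have exactly `p` labelled states with a fixed start state, and read an alphabet of `k` symbols. Each of the `p * k` transition-table entries can then name any of `p` states, and any subset of the states can be final. The function name `count_dfas` is hypothetical.

```python
def count_dfas(p, k):
    """Number of complete DFAs with p labelled states over k symbols
    (fixed start state): p**(p*k) transition tables times 2**p final sets."""
    return p ** (p * k) * 2 ** p

# Growth for a two-symbol alphabet.
growth = [count_dfas(p, 2) for p in range(1, 5)]
```

For a two-symbol alphabet the counts for one, two and three states are 2, 64 and 5832, which is why the brute-force enumeration is only of theoretical interest.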


Because VIMVAL, the actual language we are trying to type check, has very stylized 
operators, we were able to find an algorithm for type checking which is usually more 
efficient. Chapter Four describes VIMVAL in more detail. 

Chapter Four 


Type Checking in VIMVAL 


This chapter describes the types of VIMVAL [24], and how VIMVAL interacts with 
the type system developed in chapters 2 and 3. We deal with function recursion and 
polymorphism so that our type system can handle VIMVAL, then we describe the 
operators of the VIMVAL language. 


4.1 The Semantics of Modules 


A VIMVAL program consists of a set of modules, which can be compiled separately. 
Modules may use free names, which are references to other modules. The 
bindings of the free names are resolved at link time, possibly with the explicit help 
of the programmer. VIMVAL allows a module M with a free name "P" to bind 
"P" to N, even though the name of module N is not "P". Unfortunately, the 
programmer may be required to help the linker resolve free names. 


Every module is really a generator: when a module is bound to a free name, the 
module is augmented in whatever ways are possible and necessary to bring it into 
conformance with its use (i.e. it is copied, and then modified). Thus, when a 
programmer uses the built-in ARRAY-SIZE function in VIMVAL, a copy is made so 
that whatever type constraints are added to the ARRAY-SIZE function (for example 
if the programmer uses it on an array of integers) are not propagated to other uses of 
the ARRAY-SIZE function. 


Note that we do not require that there be a unique type assignment for each 
module, only that there be a unique type assignment for each augmented version of 
every module. The semantics of modules does not specify that a module must be a 
function. A module could be some other kind of value, or even a second-class value 
such as a type, since the type restrictions for each of these cases could be expressed 
as operators. 


After a copy of a module is made, the type checking system must decide on exactly 
one type for the module. This implies that all the subexpressions of the 
module must have exactly one type: in particular the arguments to functions must 
have exactly one type. This precludes certain programs which use "run-time" 
polymorphism (such as the "standard" LISP interpreter). 


4.2 Recursive Functions 


VIMVAL allows functions to call each other recursively, with the restriction that 
there can be no mutual recursion between modules. (Mutual recursion between 
functions defined inside a module is allowed.) All recursive functions are really 
treated as higher order functions, which pass other functions, perhaps copies of 
themselves, around. This implies that recursive functions, whether directly or 
indirectly recursive, must be converted to passed arguments. Because arguments 
must have a fixed type, functions must be of fixed types when used recursively. 
Recursion is treated as syntactic sugar for functions which explicitly pass other 
functions around [7]. Program examples 4-1 and 4-2 illustrate a simple case of the 
desugaring process. 

Program Example 4-1: 

% An example of recursion 
function fact(i:INT) RETURNS (INT) 
    IF i<=1 THEN 1 
    ELSE i*fact(i-1) 
    ENDIF 
ENDFUN 


Program example 4-2 shows program example 4-1 "desugarfied". The approach 
taken is to translate fact into a routine which calls dofact, which does the actual 
computation. 


Program Example 4-2: 


function fact(i:int) RETURNS (INT) 
    facttype = FUNCTYPE(INT,FACTTYPE) RETURNS (INT) 
    function dofact(i:int,f:facttype) RETURNS (INT) 
        if i<=1 then 1 
        else i*f(i-1,f) 
        endif 
    endfun % dofact 
    dofact(i,dofact) 
endfun % fact 
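
The same desugaring can be tried out in any language with first class functions. The following is a transcription of program example 4-2 into Python, offered only as an illustration (Python performs no static type checking, so it shows the control flow of the translation, not the typing). Note that `dofact` never refers to itself by name: the recursion happens entirely through the passed argument `f`.

```python
def fact(i):
    """Desugared factorial: recursion is replaced by passing dofact to itself."""
    def dofact(i, f):
        # f plays the role of the FACTTYPE argument in program example 4-2.
        return 1 if i <= 1 else i * f(i - 1, f)
    return dofact(i, dofact)

result = fact(5)  # 5 * 4 * 3 * 2 * 1
```

The argument `f` takes a function of its own type as its second argument, which is exactly the self-reference that the recursive type facttype expresses.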


There are more complex cases of mutually recursive functions. They are dealt with 
in the general case by translating 


α : FUNCTION(<args>) (<rets>) IS 
    expression-with-these-subexpressions: 
        α 
        β(...) 
        γ(...) 
END α 
β : FUNCTION(...) .... END β 
γ : FUNCTION(...) .... END γ 


where β and γ call α (directly or indirectly) into 

α : FUNCTION(<args>) (<rets>) IS 
    do-α : FUNCTION(<args>,a,b,c) (<rets>) IS 
        expression-with-these-subexpressions: 
            b(...,a,b,c) 
            c(...,a,b,c) 
    END do-α 

    do-β : FUNCTION(...,a,b,c) .... END do-β 
    do-γ : FUNCTION(...,a,b,c) .... END do-γ 

    do-α(<args>,do-α,do-β,do-γ) 
END α 
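
As a concrete instance of the general scheme, here is a small Python sketch with two mutually recursive functions. The functions `is_even`, `do_even` and `do_odd` are hypothetical examples chosen for illustration, not taken from VIMVAL; they follow the pattern above, with every helper receiving the full set of helpers as extra arguments.

```python
def is_even(n):
    """Mutual recursion between even and odd, desugared into passed arguments."""
    def do_even(n, even, odd):
        # Calls odd only through the parameter, never by its own name.
        return True if n == 0 else odd(n - 1, even, odd)

    def do_odd(n, even, odd):
        return False if n == 0 else even(n - 1, even, odd)

    return do_even(n, do_even, do_odd)

checks = [is_even(n) for n in range(4)]
```

Only `is_even` is callable from outside here; as the text notes, a similar wrapper would be needed for each function that must be called directly.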


Of course this only translates α. A similar translation would need to be made for β, 
so that β could be called directly. The following are some design considerations that 
we took into account when we made this decision: 

- We wanted VIMVAL to have a decidable type system, and found that, 
theoretically, if we do not "fix" the type of recursive calls, the type 
becomes undecidable [8]. 


- We wanted an easy to understand type system. Aesthetically, an unfixed 
type becomes very confusing on even rather simple examples of 
recursion (see program example 4-3). 


- Practically, very few programs need the extra expressive power of 
unfixed types on recursion [16]. 


Program Example 4-3: 


function F(A,B) 
    F(A,B) 
    F(B,A) 
ENDFUN % F 
F(1,1.0) % difficult to type 

function F1(A,B,F2,F3) 
    F2(A,B,F2,F3) 
    F3(B,A,F3,F2) 
ENDFUN % F1 
F1(1,1.0,F1,F1) % Much easier to type 


It is very difficult to give F a type in this example, because it is acceptable to pass F 
anything as arguments, but the arguments are switched halfway, resulting in a 
confusing type. If we write F1 instead, we can get the same meaning, but the 
program is much easier to type: 


F1aTYPE = FUNCTYPE(INT,REAL,F1aTYPE,F1bTYPE) RETURNS(...) 
F1bTYPE = FUNCTYPE(REAL,INT,F1bTYPE,F1bTYPE) RETURNS(...) 

The type of F1 when called at the top level is F1aTYPE. 

The type of the third argument is F1aTYPE. 

The type of the last argument is F1bTYPE. 


An example of the power of this kind of recursion is given in program example 4-4, 
which shows how a standard LISP function is easily written recursively in VIMVAL. 
(We also omit the type declarations to demonstrate the ease of use of type 
inference.) 

Program Example 4-4: 


function LENGTH(l) 
    tagcase l 
        tag NullVal: 0 
        tag ConsVal: 
            1+length(l.cdr) 
    endtag 
endfun % length 


4.3 "Constant" copying 


After dealing with recursion, the remaining free variables in each module are treated 
as invocations of a generator (either of a type, or a value), which does away with 
polymorphism (since after being copied, every node must be assigned exactly one 
type). 


4.4 The Restrictions for VIMVAL’s operators 


The actual restrictions for the operators of VIMVAL are presented in appendix A. 
VIMVAL does not need the full expressive power of operators: we have described 
VIMVAL using: 


Simplified closures 

Closures are specified by a set of symbols $\Psi$, and a path $\sigma$. 
VIMVAL operators are simple enough that $\sigma$ can always be 
written as a path of length zero or one. If the path is of length 
zero, then the closure gives a complete list of all the node-names. 
Our implementation assumes that the node-names mentioned in 
the operators are all the node-names in the program, which is 
slightly easier to use than if the implementation required that an 
explicit list of all the node-names be presented to the type 
checker. If the path is of length one, then $\sigma$ must be of the form 
$\langle n \rangle$ where $n$ is a node-name. 

Simplified containers 

Containers are specified by a path $\sigma$. VIMVAL's operators can be 
written in such a way that all the containers are specified by paths 
of length two: the first element is a node-name, and the second is 
a terminator (which is a selector). 


Simplified parameterized restrictions 

Either we have $A_{\langle n \rangle \equiv \langle m \rangle}$ or $A_{\langle n, \sigma \rangle \equiv \langle m \rangle}$, where $m$ and $n$ are 
node-names, and $\sigma$ is a selector. (The general form of operators 
allows parameterized restrictions of the form $A_{\alpha \equiv \beta}$ where $\alpha$ and 
$\beta$ are arbitrary elements of $\mathcal{N}\Sigma^*$.) 

These restrictions allow a great improvement in the implementation of type 
checking in VIMVAL. 


4.5 An Efficient Algorithm for Type Checking in VIMVAL 


Our technique is to maintain an equivalence relation over node-names, which 
reflects which nodes are of the same type, together with information about the closure 
for each node, and information about the transitions that any FSA which represents some 
member of our complete-restriction must follow. Hence, in most cases, we are able 
to rapidly reduce the upper bound on the number of states needed by the FSA's which 
accept members of our complete-restriction, by considering each equivalence class in the 
equivalence relation to represent one state of the FSA. The system requires at least one 
node-name in every equivalence class to have a closure restriction (because otherwise, it 
might be possible to have extra transitions leading from any state, destroying the 
uniqueness of the type assignment). 
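
The bookkeeping described in this paragraph resembles a union-find structure with extra data attached to each class. The sketch below is one possible rendering of that idea and is not the CLU implementation of Appendix C: the class name `NodeClasses`, its methods, and the use of `None` to mean "no closure restriction yet" are all assumptions made for illustration.

```python
class NodeClasses:
    """Equivalence classes of node-names, each with an optional closure set."""

    def __init__(self):
        self.parent = {}   # node-name -> parent in the union-find forest
        self.closure = {}  # class representative -> frozenset, or None if unclosed

    def find(self, n):
        """Return the representative of n's class, registering n if new."""
        if n not in self.parent:
            self.parent[n] = n
            self.closure[n] = None
        while self.parent[n] != n:
            self.parent[n] = self.parent[self.parent[n]]  # path halving
            n = self.parent[n]
        return n

    def close(self, n, selectors):
        """Record a closure restriction; repeated closures intersect."""
        r = self.find(n)
        s = frozenset(selectors)
        self.closure[r] = s if self.closure[r] is None else self.closure[r] & s

    def union(self, a, b):
        """Merge the classes of a and b, combining their closure information."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra
        self.parent[rb] = ra
        cb = self.closure.pop(rb)
        if cb is not None:
            self.close(ra, cb)
        return ra

    def unclosed_classes(self):
        """Classes with no closure restriction, which would leave the type ambiguous."""
        return {r for r, c in self.closure.items() if c is None}
```

A class returned by `unclosed_classes` corresponds to the situation the text warns about: nothing prevents extra transitions from that state, so the type assignment would not be unique.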

Definition 4-5: A Meta Finite State Automaton (MFSA) is a tuple 
$(K, \mathcal{A}, \mathcal{K}, s, F, \Sigma, \delta, C)$, where 


- $K$ is a set of states, 

- $\mathcal{A}$ is an accepting state ($\mathcal{A} \in K$), 

- $\mathcal{K}$ is an equivalence relation over $K$ (if $k \in K$ then $\mathcal{K}(k)$ is the class of 
$k$ under $\mathcal{K}$), 

- $s$ is a start state ($s \in K$), 

- $F$ is a set of final states ($F$ is the union of some of the classes of $\mathcal{K}$, 
which implies $F \subseteq K$; $\mathcal{A} \in F$), 

- $\Sigma$ is a set of symbols, 

- $\delta$ is a function mapping $(\mathcal{K} \times \Sigma) \to (\mathcal{K} \cup \{\emptyset\})$, 

- and $C$ is a function mapping $\mathcal{K} \to \mathcal{P}(\Sigma)$. 


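Definition 4-6: A configuration of a MFSA is a pair $(k, \sigma)$ where $k \in K$ 
and $\sigma \in \Sigma^*$. 

Definition 4-7: A binary relation $\vdash_M$ holds between configurations of $M$, 
a MFSA: $(k, \sigma) \vdash_M (k', \sigma') \iff \sigma' = \mathrm{rest}(\sigma)$ and $\delta(\mathcal{K}(k), \mathrm{head}(\sigma)) = \mathcal{K}(k')$. 

The reflexive transitive closure of $\vdash_M$ is denoted as $\vdash_M^*$. 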


So far, MFSA are very similar to FSA. Now we are going to define some interesting 
operations which allow us to perform our type checking algorithm. First we are 
interested in restricting the set of FSA's that our MFSA represents to those which 
correspond to one of the cases of a simplified parameterized restriction. (See section 
4.4.) 

Definition 4-8: If $M$ is an MFSA, and $i$ and $j$ are states, then 
$\mathrm{equate}(M, i, j) =_{\mathrm{def}} (K, \mathcal{A}, \mathcal{K}', s, F', \Sigma, \delta', C')$, 

where $b \in \mathcal{K}'(a)$ (i.e. $a$ and $b$ are in the same class under $\mathcal{K}'$) $\iff$ there is 
some finite path $\sigma$ such that $(i, \sigma) \vdash_M^* (a, \langle\rangle)$ and $(j, \sigma) \vdash_M^* (b, \langle\rangle)$, 

and $F'$ is the union of all the elements of $\mathcal{K}'$ which have some element in 
$F$, 

and $\delta'(\mathcal{K}'(a), \alpha) = \mathcal{K}'(b) \iff \exists (x, y) \in \mathcal{K}'(a) \times \mathcal{K}'(b)$ such that 
$\delta(\mathcal{K}(x), \alpha) = \mathcal{K}(y)$, 

and $C'(y) = \bigcap_{z \in y} C(\mathcal{K}(z))$ for every class $y$ of $\mathcal{K}'$. 


The equate operation on MFSA gives us the set of FSA's in which a given pair of 
states are always state equivalent. 


Next we are interested in the case of a container. 

Definition 4-9: If $M$ is an MFSA, $k \in K$, and $\alpha \in \Sigma$, then 
$\text{has-path}(M, k, \alpha) =_{\mathrm{def}} M'$, 

where if there is some $x \in \delta(\mathcal{K}(k), \alpha)$ then $M' = \mathrm{equate}(M, x, \mathcal{A})$, 
otherwise $M' = M$, except for the transition function $\delta'$, which is the 
same as $\delta$, except that $\delta'(\mathcal{K}(k), \alpha) = \mathcal{K}(\mathcal{A})$. 


The next definition allows us to deal with the second case of a simplified 
parameterized restriction. (See section 4.4.) 

Definition 4-10: If $M$ is an MFSA, $i, j \in K$, and $\alpha \in \Sigma$, then 
$\text{has-subpath-to}(M, i, \alpha, j) =_{\mathrm{def}} M'$, 

where if there is some $x \in \delta(\mathcal{K}(i), \alpha)$ then $M' = \mathrm{equate}(M, x, j)$, otherwise 
$M' = M$ except for the function $\delta'$, which is the same as $\delta$, except that 
$\delta'(\mathcal{K}(i), \alpha) = \mathcal{K}(j)$. 
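
The three operations can be sketched in a few dozen lines. The version below is a simplified illustration, not the implementation of Appendix C: it identifies states with node-names (so paths of length one need no start state), represents the equivalence relation with a union-find forest, and propagates an equate through transitions with a work list. The class name `MFSA`, the constant `ACCEPT` and the helper `follow` are hypothetical names.

```python
ACCEPT = "<accept>"  # stands in for the accepting state

class MFSA:
    """Simplified meta-automaton: classes of states plus one transition map per class."""

    def __init__(self):
        self.parent = {ACCEPT: ACCEPT}
        self.delta = {ACCEPT: {}}  # class representative -> {symbol: target state}

    def find(self, k):
        """Representative of k's class, registering k on first use."""
        if k not in self.parent:
            self.parent[k] = k
            self.delta[k] = {}
        while self.parent[k] != k:
            self.parent[k] = self.parent[self.parent[k]]
            k = self.parent[k]
        return k

    def equate(self, i, j):
        """Merge the classes of i and j, then merge whatever equal paths reach."""
        work = [(i, j)]
        while work:
            a, b = work.pop()
            ra, rb = self.find(a), self.find(b)
            if ra == rb:
                continue
            self.parent[rb] = ra
            for symbol, target in self.delta.pop(rb).items():
                if symbol in self.delta[ra]:
                    # Both classes already had this transition: targets must agree.
                    work.append((self.delta[ra][symbol], target))
                else:
                    self.delta[ra][symbol] = target

    def has_subpath_to(self, i, symbol, j):
        """Require that following `symbol` from i leads to the class of j."""
        ri = self.find(i)
        self.find(j)
        if symbol in self.delta[ri]:
            self.equate(self.delta[ri][symbol], j)
        else:
            self.delta[ri][symbol] = j

    def has_path(self, k, symbol):
        """Require that following `symbol` from k reaches the accepting state."""
        self.has_subpath_to(k, symbol, ACCEPT)

    def follow(self, k, path):
        """Class reached from k along `path`, or None if a transition is missing."""
        r = self.find(k)
        for symbol in path:
            if symbol not in self.delta[r]:
                return None
            r = self.find(self.delta[r][symbol])
        return r
```

For example, after `has_subpath_to("n2", "LIST", "n1")` and `has_path("n1", "INT")`, following `["LIST", "INT"]` from `n2` reaches the accepting class. A recursive type arises naturally: `has_subpath_to("l", "CDR", "l")` makes the class of `l` its own successor. Closure sets are omitted here to keep the sketch short.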


Note that a MFSA describes a set of type assignments if the following conditions 
hold: 


1. For any node-name $n$, $\{ \alpha \mid \delta(\mathcal{K}(n), \alpha) \neq \emptyset \}$ is a subset of some selector 
class, and is also a subset of $C(\mathcal{K}(n))$. 

2. If $n \in K$, $m \in F$, $\alpha \in \Sigma$, and $\delta(\mathcal{K}(n), \alpha) = \mathcal{K}(m)$ then $\alpha$ is a terminator. 

3. For every $X \in \Sigma$, $\delta(\mathcal{K}(\mathcal{A}), X) = \emptyset$. 


Note that a MFSA describes a single type assignment if the following condition 
holds: 


1. For every node-name $n$, $\{ \alpha \mid \delta(\mathcal{K}(n), \alpha) \neq \emptyset \} = C(\mathcal{K}(n)) \neq \Sigma$. 


To compute equate(M,i/), has—path(M,i,a), and has—subpath—to(M,i,a,j) only 


takes on the order time 7” in the worst case, and usually is much better. 


To compute the type assignment for a program, we perform the following: 


1. Build the MFSA with all the closures matching the closure operators in 
the program. (This is easy: if a node $z$ is closed with the set $\Psi$ in the 
program, we have the function $C(z) = \Psi$. If a node $z$ has no closures in 
the program, then $C(z) = \Sigma$.) 


2. Construct new MFSA's, by composing the MFSA operations which 
correspond to the operators in the program. It does not matter which 
order they are composed in, since the MFSA operations describe set 
intersection: if $A$ is a program operator corresponding to some MFSA 
operation $F$, and $B$ is a set of type assignments corresponding to some 
MFSA $M$, then $A \cap B$ is a set of type assignments corresponding to the 
MFSA $F(M)$. Here is the correspondence between program operators 
and MFSA operations: 

Program Operator                                                            MFSA Operation 

$(\mathrm{SOTA}_N)_{\langle n \rangle \equiv \langle m \rangle}$                    equate(M, n, m) 
$(\mathrm{SOTA}_N)_{\langle n, \sigma \rangle \equiv \langle m \rangle}$            has-subpath-to(M, n, $\sigma$, m) 
container of $\langle n, \sigma \rangle$ in $\mathrm{SOTA}_N$                       has-path(M, n, $\sigma$) 


3. Test to see if the MFSA represents a set of type assignments (in which 
case we know that the program is not type-overconstrained), and if the 
MFSA represents a single type assignment (in which case we know that 
the program is not type-ambiguous). 


Appendix C contains the listing of a CLU [13] program to perform type checking on 
VIMVAL. 

Chapter Five 


Conclusion 


Did We Meet Our Goals? 


While the VIMVAL compiler is not yet finished, and we have no actual experience 
using VIMVAL, we feel confident that VIMVAL has much of the power and ease of use 
stated in our original goals. This power is illustrated by a few examples in Appendix 
B. We believe that VIMVAL provides a notation for polymorphic programs that is 
easy to learn and use, and we proved that VIMVAL is type safe, meeting the high 
level goals outlined in the introduction. The actual type rules of VIMVAL are fairly 
simple: 

- There must be exactly one legal type for every value in a VIMVAL 
program. 


- The type of a value is constrained by the operators that operate on the 
value. The VIMVAL manual [24], and appendix A, describe the 
constraints that each operator places on its operands. Intuitively, the 
arguments have to be used in a "consistent" way. (This is easy to state, 
but sometimes rather difficult to apply in practice, since the human 
programmer may have to actually use our processes to determine the 
type assignments of a program.) 


- Recursive functions are of a fixed type, but other modules are copied 
before they are compiled, which allows polymorphic functions to be 
written. 


VIMVAL requires a fairly complex type checking algorithm, which may require 
quite a bit of computation in the worst case. We believe that this complexity is 
acceptable in the light of VIMVAL's ease of use, and given that VIMVAL is designed 
to run on a highly parallel computer. 

Type inference allows programmers to write code which is difficult to read. 
Empirically, we could argue that if type inference is difficult for a computer, it is 
probably also difficult for people who are reading a program. (E.g., we found it 
difficult to infer "in our heads" the type of the Y-combinator (shown below) but our 
type checking algorithm correctly computed the Y-combinator's type.) 


Comparison with other Work 


VIMVAL's type system is different from Milner's [16], in that we allow "ad hoc 
polymorphism" in the case of certain built in operators (such as +, which can take 
real or integer arguments). Milner discussed the possibility of adding such ad hoc 
polymorphism. 


A more important difference between our type system and Milner’s is that we allow 
recursive types. The recursive types allow us to type Curry’s Y combinator (which 
Milner’s system can not type). 

Program Example 5-1: 


function Y(f) 
    function f1(x) 
        f(x(x)) 
    endfun 
    f1(f1) 
endfun 


which could be re-written without type inference. 
Program Example 5-2: 


Ytype = Functype(Ytype) returns(Ytype) 
function Y(f:Ytype) returns(Ytype) 
    function f1(x:Ytype) returns(Ytype) 
        f(x(x)) 
    endfun 
    f1(f1) 
endfun 
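
The self-application `x(x)` is the reason a recursive type is needed: the argument of `x` must have the same type as `x` itself. The shape of the combinator can be tried out in Python, with one adaptation that is ours and not part of the VIMVAL example: Python evaluates arguments eagerly, so a literal transcription would evaluate `x(x)` forever before `f` is ever called. The sketch therefore wraps the self-application in a lambda (the variant often called the Z combinator). The name `fact` and the function passed to `Y` are illustrative.

```python
def Y(f):
    """Fixed-point combinator, eta-expanded so it terminates under eager evaluation."""
    def f1(x):
        # x is applied to itself: its type must mention itself (a recursive type).
        return f(lambda *args: x(x)(*args))
    return f1(f1)

# A factorial written without any named recursion.
fact = Y(lambda rec: lambda n: 1 if n <= 1 else n * rec(n - 1))
value = fact(5)
```

Python does not check types statically, so this only demonstrates that the construction computes what is expected; assigning it a type is the part that requires the recursive Ytype above.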


Except for the above differences, our concepts of type and sets of type assignments 
are not really different from Milner's. Instead of finding the "most general type" of 
an expression, and then instantiating the expression with specific types to get a 
"monotype", as Milner does, we copy the expression, and then deduce what the type 
of the expression must be. These approaches are equivalent, because a "monotype" 
is a member of a "most general type" if and only if there is a context in which the 
expression could have a type corresponding to the "monotype". 


Our approach to types can be generalized to include type abstraction [12] by 
defining a correspondence between the legal operations on user defined abstract 
types and an augmented selector alphabet: abstract types are sets of objects with a 
set of operations [17], and a type checking algorithm would simply generate the 
additional selectors that the abstract type needs (which are different from the 
previously defined selectors), and put them all in the same selector class. None of 
the new selectors would be terminators. The rest of our type checking system would 
apply to this new system. We did not make this generalization because we wanted 
to limit the scope of this work, and because VIMVAL is perceived as a 
"number-crunching" language, which does not require the powerful and easy to use 
abstraction mechanisms that are found in CLU [13]. VIMVAL does have a type 
abstraction mechanism, which involves encapsulating a data type inside a procedure, 
but the mechanism is not easy to use (syntactic sugar would help solve this 
problem [14]), and it is impossible to maintain a representation invariant for objects 
of a given abstract type [12] (such as a requirement that an array be a sorted array). 


A View from Above 


The "high level goals" for the MIT Computation Structures Group were well stated 
in [3]: 

to present a system model for a kind of ideal multiprogrammed computer 
system, one that would serve many users in a way permitting sharing of 
the products of their individual programming efforts consonant with the 
principles of program modularity -- the ability to build program units 
which can be combined to form higher units, etc. 


We believe that the development of the type system for VIMVAL is an important 
milestone in the development of the VIMVAL language, which in turn represents an 
important step on the path to that high level goal. 

Appendix A 


VIMVAL Operators and their Restrictions 


This appendix describes the actual operators that are in VIMVAL. Much of this 
appendix is borrowed from Peacock's [19] appendix A. 


We adopt the convention that every operator has $n$ input nodes, named $x_1, \ldots, x_n$, and 
$m$ result nodes named $y_1, \ldots, y_m$. An operator is a set of regular sets, and we give the set 
for each operator. 


$\Sigma$ = 
{ REAL, INT, CHAR, BOOL, NULL, ARRAY, STREAM, 
GET-$\alpha$, IS-$\alpha$, ARG-$n$, RET-$n$ 
| $\alpha$ is a legal VIMVAL identifier, and $n$ is a positive integer } 


The correspondence between selectors in our type system, and the "type classes" in 
VIMVAL are as follows: 


selector            type class 
REAL                REAL 
INT                 INT 
CHAR                CHAR 
BOOL                BOOL 
ARRAY               ARRAY 
STREAM              STREAM 
GET-$\alpha$        RECORD 
IS-$\alpha$         ONEOF 
RET-$n$, ARG-$n$    FUNCTION 

The terminators are 


{ REAL, INT, CHAR, BOOL, NULL }. 


The selector classes are: 


{ REAL }, 
{ INT }, 
{ CHAR }, 
{ BOOL }, 
{ NULL }, 
{ ARRAY }, 
{ STREAM }, 
{ GET-$\alpha$ | $\alpha$ is a legal VIMVAL identifier }, 
{ IS-$\alpha$ | $\alpha$ is a legal VIMVAL identifier }, 
{ ARG-$n$, RET-$n$ | $n$ is a positive integer }. 


We will call the set of all type assignments $\Theta$. 


There is a little bit of added complexity due to the non-uniform polymorphism of 
some of the operators in VIMVAL. The + operator, for example allows arguments 
which are either all integers or all reals. We can deal with such finite disjoint unions 
of operators, by computing a separate complete-restriction for every possibility. We 
will refer to { INT, REAL, CHAR, BOOL } as RICB, { REAL, INT } as RI, and 
{ REAL, CHAR } as RC. 
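
The idea of computing a separate complete-restriction for every possibility can be sketched as an enumeration over the alternatives of each operator occurrence. The code below is a deliberately reduced illustration: it only tracks which terminator each node must carry, so it covers scalar operators such as + and real but not arrays or functions, and the names `plus`, `to_real` and `solutions` are assumptions introduced here.

```python
from itertools import product

RI = ("REAL", "INT")

def plus(x1, x2, y1):
    """One alternative per p in RI: both inputs and the result have type p."""
    return [[(x1, p), (x2, p), (y1, p)] for p in RI]

def to_real(x1, y1):
    """The real conversion has a single alternative: INT in, REAL out."""
    return [[(x1, "INT"), (y1, "REAL")]]

def solutions(occurrences):
    """Try every combination of alternatives; keep the consistent assignments."""
    found = []
    for choice in product(*occurrences):
        assignment = {}
        consistent = True
        for alternative in choice:
            for node, terminator in alternative:
                # A node may carry only one terminator.
                if assignment.setdefault(node, terminator) != terminator:
                    consistent = False
        if consistent:
            found.append(assignment)
    return found

# c = real(a); d = c + b  has exactly one consistent choice of alternatives.
program = [to_real("a", "c"), plus("c", "b", "d")]
unique = solutions(program)
```

With this encoding a lone `plus("a", "b", "c")` yields two solutions (type ambiguous), while the program above yields one, because the REAL result of the conversion rules out the integer alternative of +. The number of combinations grows with the product of the alternatives, so this is a sketch of the idea rather than an efficient method.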


Most operators in the VIMVAL language correspond to more than one operator as 
defined in definition 3-7. Rather than write the operators in the form $(\mathrm{SOTA}_N)_{r_i}$ 
for $i$ in some set of integers, we will write the restrictions in standard set notation. 


We will also choose not to mention the closure operator for operators which 
mention selectors which are in selector classes of order one. This set of selectors is 
OWNCLASS = { REAL, INT, CHAR, BOOL, NULL, ARRAY, STREAM }. In 
general, if an operator specifies that there is some path $\langle z, \sigma \rangle$, with 
$\sigma \in$ OWNCLASS, then there is an implied closure operator of the form 

$(\mathrm{SOTA}_N)_{\langle z \rangle \subseteq \{\sigma\}}$ 


A.1 Basic Operators 


A.1.1 Error Tests 

There are three universal error tests in VIMVAL. Their names are is-undef, 
is-miss-elt, and is-error. They have 1 input and 1 output. Their only constraint 
is that the output must be boolean. 

$\{ S \in \Theta \mid \langle y_1, \mathrm{BOOL} \rangle \in S \}$ 

A.1.2 Equal and Not Equal 

Equal, (=), and not equal, (~=), are in a special class because they constrain their 
argument types not to a specific type but to a set of four possible types, namely real, 
integer, char, or bool. They have 2 inputs and 1 output. The inputs must be the 
same type and the output is a bool. Thus there is one operator for every $p \in$ RICB. 


$\forall p \in \mathrm{RICB}$: 
$\{ S \in \Theta \mid \{ \langle x_1, p \rangle, \langle x_2, p \rangle, \langle y_1, \mathrm{BOOL} \rangle \} \subseteq S \}$ 


A.1.3 Boolean Operators 

There are two classes of boolean operators in VIMVAL. The first class has two 
arguments, the second has one. 


A.1.3.1 Two Argument Boolean Operators 

The members of the class with two arguments are and, (&); and or, (|). Their 
constraints are that all the inputs and results must be bool. 

$\{ S \in \Theta \mid \{ \langle x_1, \mathrm{BOOL} \rangle, \langle x_2, \mathrm{BOOL} \rangle, \langle y_1, \mathrm{BOOL} \rangle \} \subseteq S \}$ 

A.1.3.2 One Argument Boolean Operators 

The second class has only one member, the not, (~) operator. The input and result 
are both bool. 

$\{ S \in \Theta \mid \{ \langle x_1, \mathrm{BOOL} \rangle, \langle y_1, \mathrm{BOOL} \rangle \} \subseteq S \}$ 


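A.1.4 Type Conversion Operations 

There are three operations intended to convert one data type into another. These 
are real, character, and integer. They all have one input and one result. 


real          $\{ S \in \Theta \mid \{ \langle x_1, \mathrm{INT} \rangle, \langle y_1, \mathrm{REAL} \rangle \} \subseteq S \}$ 
integer       $\forall p \in \mathrm{RC}$: $\{ S \in \Theta \mid \{ \langle x_1, p \rangle, \langle y_1, \mathrm{INT} \rangle \} \subseteq S \}$ 
character     $\{ S \in \Theta \mid \{ \langle x_1, \mathrm{INT} \rangle, \langle y_1, \mathrm{CHAR} \rangle \} \subseteq S \}$ 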


A.1.5 Real and Integer Operations 

Most real and integer operations have the same names. Those that do are divided 
into four classes. There are some special cases, which are described after the four 
classes. 

A.1.5.1 Binary Operators 

The first class takes two arguments and returns one result, all three types being the 
same type, and being real or integers. The members of this class are plus, (+); 
minus, (-); multiply, (*); divide, (/); max; and min. 


$\forall p \in \mathrm{RI}$: 
$\{ S \in \Theta \mid \{ \langle x_1, p \rangle, \langle x_2, p \rangle, \langle y_1, p \rangle \} \subseteq S \}$ 


A.1.5.2 Unary Operators 

The next class has one argument and one result, both of the same type, and both 
either integer or real. The members of this class are negation, '-'; and abs. 

$\forall p \in \mathrm{RI}$: 
$\{ S \in \Theta \mid \{ \langle x_1, p \rangle, \langle y_1, p \rangle \} \subseteq S \}$ 

A.1.5.3 Relational Operators 

The next class has two arguments and one result. The arguments must be the same 
type, and be integer or real. The result is a boolean. The members of this class are 
>, <, >=, and <=. 


$\forall p \in \mathrm{RI}$: 
$\{ S \in \Theta \mid \{ \langle x_1, p \rangle, \langle x_2, p \rangle, \langle y_1, \mathrm{BOOL} \rangle \} \subseteq S \}$ 

A.1.5.4 Exception Predicates 

The fourth and final class of real/integer operations has one argument and one 
result. The argument can be real or integer, and the result is a boolean. The 
members of this class are is-pos-over, is-neg-over, is-unknown, 
is-zero-divide, is-over, and is-arith-error. 


$\forall p \in \mathrm{RI}$: 
$\{ S \in \Theta \mid \{ \langle x_1, p \rangle, \langle y_1, \mathrm{BOOL} \rangle \} \subseteq S \}$ 


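A.1.5.5 Special Cases 

There are five operations that operate on real and integer types which do not fit into 
the above classes. The first of these special cases is mod, with two arguments and 
one result, all of which are integer. 

$\{ S \in \Theta \mid \{ \langle x_1, \mathrm{INT} \rangle, \langle x_2, \mathrm{INT} \rangle, \langle y_1, \mathrm{INT} \rangle \} \subseteq S \}$ 


The second special case is exp (which computes $x_1^{x_2}$), with two inputs and one 
result. If $x_2$ is REAL then all are real, and if $y_1$ is INT then all are integers. 

$\{ S \in \Theta \mid \{ \langle x_1, \mathrm{REAL} \rangle, \langle x_2, \mathrm{REAL} \rangle, \langle y_1, \mathrm{REAL} \rangle \} \subseteq S \}$ 

$\{ S \in \Theta \mid \{ \langle x_1, \mathrm{INT} \rangle, \langle x_2, \mathrm{INT} \rangle, \langle y_1, \mathrm{INT} \rangle \} \subseteq S \}$ 

$\{ S \in \Theta \mid \{ \langle x_1, \mathrm{REAL} \rangle, \langle x_2, \mathrm{INT} \rangle, \langle y_1, \mathrm{REAL} \rangle \} \subseteq S \}$ 


The final three special cases are is-pos-under, is-neg-under, and is-under, with 
one input (a real) and one output (a boolean). 

$\{ S \in \Theta \mid \{ \langle x_1, \mathrm{REAL} \rangle, \langle y_1, \mathrm{BOOL} \rangle \} \subseteq S \}$ 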

A.1.6 The empty operation 

The empty operation has no inputs, and one result: a stream or an array. There is a "dummy" node called z which is used for technical reasons:

{ S ∈ O | <y_1, ARRAY> ≡_S <z> }

{ S ∈ O | <y_1, STREAM> ≡_S <z> }

A.1.7 Array Operators

A.1.7.1 Array-fill

The array-fill operator has three inputs and one output. The first two inputs are integers, and the output is an array whose elements have the type of x_3.

{ S ∈ O | { <x_1, INT>, <x_2, INT> } ⊆ S,
          and <y_1, ARRAY> ≡_S <x_3> }


A.1.7.2 Select 

The select operator ([]) has two inputs, an array and an integer, and an output, an element of the array.

{ S ∈ O | { <x_2, INT> } ⊆ S, and <x_1, ARRAY> ≡_S <y_1> }

A.1.7.3 Append 

The append operation takes three inputs and gives one result. The first input, the last input, and the output are all arrays of the same type. The second input is an integer.

{ S ∈ O | { <x_2, INT> } ⊆ S,
          and <x_1, ARRAY> ≡_S <x_3, ARRAY> ≡_S <y_1, ARRAY> }


A.1.7.4 Create-by-elements 

The create-by-elements operator [:] takes n>1 inputs and gives one result. The first input is an integer, and the output is an array of the second input's type. The rest of the inputs must be the same type as the second input.

{ S ∈ O | { <x_1, INT> } ⊆ S, and <x_i> ≡_S <y_1, ARRAY> for i ∈ {2,...,n} }

A.1.7.5 Array To Integer Operators

The following three operators have the same constraints: array-limh, array-liml and array-size. They take an array input and give an integer result. We need a dummy node named z.

{ S ∈ O | { <y_1, INT> } ⊆ S, and <x_1, ARRAY> ≡_S <z> }

A.1.7.6 Array-adjust

The array-adjust operator takes three inputs and gives an output. The first two inputs are integers. The last input and the output are arrays of the same type.

{ S ∈ O | { <x_1, INT>, <x_2, INT> } ⊆ S, and <x_3, ARRAY> ≡_S <y_1, ARRAY> }

A.1.7.7 Array-addh and Array-addl

The operations array-addh and array-addl both take two inputs and yield an output. The first input and the output are arrays of the second input's type.

{ S ∈ O | <x_1, ARRAY> ≡_S <x_2> ≡_S <y_1, ARRAY> }

A.1.7.8 Array-remh and Array-reml 

The operations array-remh and array-reml both take one input and give one output. 
The input is an array of the output’s type. 

{ S ∈ O | <x_1, ARRAY> ≡_S <y_1> }

A.1.7.9 Array-setl and Array-seth 

The operations array-setl and array-seth take an array and an integer and give an 
array output. The first input and the output are arrays of the same type. 

{ S ∈ O | { <x_2, INT> } ⊆ S, and <x_1, ARRAY> ≡_S <y_1, ARRAY> }

A.1.7.10 Concatenate and Join

The operations concatenate and array-join take two arrays and give one array, all of the same type.

{ S ∈ O | <x_1, ARRAY> ≡_S <x_2, ARRAY> ≡_S <y_1, ARRAY> }

A.1.8 Stream Operations

A.1.8.1 Stream Creation

The stream operator allows n inputs and one output. There is really one operator for every non-negative number n. (We will assume that there is at least one input. If not, we need a dummy input, which we can call x_1.) The inputs must all be the same type, and the output is a stream of that type.

{ S ∈ O | <x_i> ≡_S <y_1, STREAM> for i ∈ {1,...,n} }

A.1.8.2 Stream Null

The null operator takes a stream and returns a boolean. We need a dummy node named z.

{ S ∈ O | <y_1, BOOL> ∈ S, and <x_1, STREAM> ≡_S <z> }

A.1.8.3 Stream First

The first operator takes a stream[T] and returns a T.

{ S ∈ O | <x_1, STREAM> ≡_S <y_1> }

A.1.8.4 Stream Rest

The rest operator takes a stream and returns a stream of the same type. We need a dummy node z to describe this restriction.

{ S ∈ O | <x_1> ≡_S <y_1>, and <x_1, STREAM> ≡_S <z> }

A.1.8.5 Stream affix

The affix operator takes a stream[T] and a T, and returns a stream[T].

{ S ∈ O | <x_1, STREAM> ≡_S <x_2> ≡_S <y_1, STREAM> }


A.1.9 Record Operators 


A.1.9.1 The Record Constructor 


The record_{a_1,...,a_n} operator takes n inputs and gives one output. Note that there is one record operator for every finite set of VIMVAL identifiers. Assume that a_1, ..., a_n are sorted lexicographically. We must be sure to exclude other selectors on the output.

{ S ∈ O | <y_1, GET-a_i> ≡_S <x_i> for i ∈ {1,...,n},
          and <y_1, GET-B, ...> ∉ S if B ∉ { a_1, ..., a_n } }


A.1.9.2 Record Selection 

The select_a operation on records takes a record and gives a value which was stored in the record. Note that we must be careful to allow paths that start with GET-B, for all B≠a, because the select_a path does not say anything about the other selectors.

{ S ∈ O | <x_1, GET-a> ≡_S <y_1> }

A.1.9.3 Record Replace

The replace_a operation on records takes a record and a value, and returns a new record of the same type.

{ S ∈ O | <x_1> ≡_S <y_1>, and <x_1, GET-a> ≡_S <x_2> }

A.1.10 Union Types 


A.1.10.1 Union Make 

The make_a operator takes an object and returns a oneof.

{ S ∈ O | <x_1> ≡_S <y_1, IS-a> }

A.1.10.2 Union Is

The is_a operator takes a oneof and returns a boolean. We need a dummy node named z.

{ S ∈ O | <x_1, IS-a> ≡_S <z>, and <y_1, BOOL> ∈ S }

A.1.11 Constants

Integer, real, and character constants have no inputs and one output. The output must be the type of the constant.

    Real         { S ∈ O | <y_1, REAL> ∈ S }
    Integer      { S ∈ O | <y_1, INT> ∈ S }
    Character    { S ∈ O | <y_1, CHAR> ∈ S }

A.2 Type Declarations

Variables and formal arguments may have type information explicitly given about them through a type specification. The type specification is treated just like an expression for the purposes of typing.

A.2.1 Basic Type Specifications 

Reals, integers, characters, booleans, and null can each be specified by their names, 
which have selectors associated with them: REAL, INT, CHAR, BOOL, and NULL 
respectively. If we see a basic type a, with selector B, then there is only one 
“output”, and that is the type a. 

{ S ∈ O | <y_1, B> ∈ S }

A.2.2 Array and Stream Type specifications

If we see a type specification ARRAY[A], where A is a type specification, then we say the "output" is an array of A, and the "input" is A.

{ S ∈ O | <y_1, ARRAY> ≡_S <x_1> }

Similarly for streams:

{ S ∈ O | <y_1, STREAM> ≡_S <x_1> }


A.2.3 Record and Oneof Type specifications 


If we see a record type specification RECORD[a_1:A_1, ..., a_n:A_n] then we treat it exactly the same as the record constructor in section A.1.9.1.

Similarly for oneof type specifications: there is no oneof constructor that specifies all the arms, but it should be treated like a record constructor, with all the GET-a's replaced by IS-a's:

{ S ∈ O | <y_1, IS-a_i> ≡_S <x_i> for i ∈ {1,...,n},
          and if <y_1, IS-B, ...> ∈ S, then B = a_j for some j }

A.2.4 Function Type Specifications

Function type specifications are treated just like function applications in section A.4.1. Instead of having subexpressions, we have subtypes.


A.2.5 Free Variables as Type Specifications 


A free variable just names a single node, as is true for any VIMVAL expression.

A.3 Basic Constructs


A.3.1 If then else 

The if then else operator appears in the form: 
IF <exp1> THEN <exp2> ELSE <exp3> ENDIF 
We require that <exp1> be a boolean 1-valued expression, and that <exp2> and <exp3> be m-valued expressions, where <exp2>_i is the same type as <exp3>_i for i=1 through m. The IF is an m-valued expression.

We label the <exp1> node x_{1,1}, the <exp2> nodes x_{2,1}, ..., x_{2,m}, the <exp3> nodes x_{3,1}, ..., x_{3,m}, and the result nodes y_1, ..., y_m.

{ S ∈ O | <x_{1,1}, BOOL> ∈ S,
          and <x_{2,i}> ≡_S <x_{3,i}> ≡_S <y_i> for i in {1,...,m} }

A.3.2 Tagcase

The tagcase construct appears in the form:

    TAGCASE <exp>
        TAG a_1 (n_1): <exp_1>
        TAG a_2 (n_2): <exp_2>
        ...
        TAG a_n (n_n): <exp_n>
        { OTHERWISE : <exp_{n+1}> }
    ENDTAG

The requirements are that <exp_1>..<exp_{n+1}> are the same type, and T(<exp>) must be a oneof type with a_1..a_n as tag values. (If the OTHERWISE is not included, then there must be no other tag values.) The value of a TAGCASE can be an m-valued expression.

We label the node of <exp> as x_0, and the nodes of <exp_i> as x_{i,j} for j=1,...,m. The resulting nodes of the tagcase are y_j for j=1,...,m.

If the OTHERWISE is included we have:

{ S ∈ O | <exp_i> ≡_S <y_i> for i in {1,...,n+1},
          and <exp, GET-a_i> ≡_S <n_i> }

If the OTHERWISE clause is not included, add the extra restriction that there are no other tags:

{ S ∈ O | <exp_i> ≡_S <y_i> for i in {1,...,n+1},
          and <exp, GET-a_i> ≡_S <n_i>,
          and if <exp, GET-B, ...> ∈ S, then B = a_i for some i }

A.3.3 Forall construct

The forall construct appears as:

    FORALL <var> IN [ <exp1> , <exp2> ]
        CONSTRUCT or EVAL <exp3>
    ENDALL

There are two cases, the CONSTRUCT and the EVAL case. In every case, <exp1> and <exp2> must be integer. We label <exp1>'s node x_1, <exp2>'s node x_2, and <exp3>'s node x_3. The result node is y_1.

A.3.3.1 Forall with CONSTRUCT

The restriction for the CONSTRUCT case is that if <exp3> is of type T, then y_1 is of type ARRAY[T].

{ S ∈ O | { <x_1, INT>, <x_2, INT> } ⊆ S, and <x_3> ≡_S <y_1, ARRAY> }

A.3.3.2 Forall with EVAL

There are six possible "evaluation operators" for the EVAL clause of a forall statement. In each case we have the additional restriction that the type of <exp3> must be the same as the type of the output.

There are more restrictions, based on which evaluation operator is used.

In the case of +, *, min, or max we have the restriction that <exp3> must be an integer or a real:

∪_{p ∈ RI}
{ S ∈ O | { <x_1, INT>, <x_2, INT>, <x_3, p>, <y_1, p> } ⊆ S }

In the case of & or "or" we have the restriction that <exp3> must be boolean:

{ S ∈ O | { <x_1, INT>, <x_2, INT>, <x_3, BOOL>, <y_1, BOOL> } ⊆ S }

A.4 Functions

There are two ways that a function is encountered in VIMVAL. The first is the declaration of a function, which is first treated by the compiler to get rid of polymorphism and recursion. The second is when the function is passed as an argument (either to a built-in operator such as apply, or to another function).

A.4.1 Function Declaration

After the compiler has removed polymorphism and recursion, the type checker sees a "function declaration" node, which we can write as

    FUNCTION(a_1,...,a_n) RETURNS (B_1,...,B_m) <EXPRESSION> END FUNCTION

where the a_i's and B_j's actually are node names of nodes inside <EXPRESSION>. We assume that <EXPRESSION> is m-valued, and that B_j is the name of the jth output node of <EXPRESSION>. The resulting type constraint of a function declaration is that the output is a function taking n values, such that the ith value is of type a_i, and returning m values, such that the jth returned value is of type B_j. y_1 refers to the node of the actual function.

{ S ∈ O | <y_1, ARG-i> ≡_S <a_i>,
          and <y_1, RET-j> ≡_S <B_j> for appropriate i and j }


A.4.2 Function Application 

If we see

    <exp>(<exp_1>, ..., <exp_n>)

then we have a function application. The requirements are that <exp_i> be the same type as the ith argument of <exp>, and that the jth output of this function application is the type of the jth return value of <exp>. Here <exp> is labeled x_0, and <exp_i> is labeled x_i. The outputs are labeled y_j for appropriate values of j.

{ S ∈ O | <x_0, ARG-i> ≡_S <x_i>, and <x_0, RET-j> ≡_S <y_j> }

Examples of the power of VIMVAL

Program example 5-3 composes two functions to give a new one. One weakness in our type system is that one cannot write a function which takes an arbitrary number of arguments. (This weakness is a result of the syntax of VIMVAL, rather than the type system itself.)

Program Example 5-3:

    function compose (F:functype(B) returns (C),
                      G:functype(A) returns (B))
            returns (functype(A) returns (C))
        function composer (aval:A) returns (C)
            F(G(aval))
        endfun % composer
        composer % return the composer
    endfun % compose


Program example 5-4 implements the same function, with type inference instead. 
Program Example 5-4: 


    function compose (F,G)
        function composer (aval)
            F(G(aval))
        endfun % composer
        composer
    endfun % compose
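
For readers more familiar with mainstream languages, the untyped compose of example 5-4 corresponds closely to a closure in Python. The sketch below is an analogy, not VIMVAL: Python checks nothing statically, so the inference step that VIMVAL performs on F, G, and aval has no counterpart here.

```python
def compose(f, g):
    """Return a function that applies g first and then f."""
    def composer(aval):
        return f(g(aval))
    return composer  # return the composer, as in example 5-4


# Usage: build "increment, then convert to text" from two smaller functions.
inc_then_text = compose(str, lambda n: n + 1)
```

Here inc_then_text(41) first applies the increment and then str, so it yields the string "42".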


Program example 5-5 shows how a multiplier, the encapsulation of multiplication by 
a constant, can be implemented in VIMVAL: 
Program Example 5-5: 


    % MakeMul takes an integer I and returns a
    % function which multiplies integers by I
    function MakeMul(i:INT) returns (FUNCTYPE(INT) returns (INT))
        function doIt(j:int) returns (int) i*j endfun
        doIt % return doIt
    endfun

Program example 5-6 shows how the multiplier in example 5-5 can be written without explicit type declarations. This example is slightly more powerful, in that it can operate on reals or integers.

Program Example 5-6:

    function MakeMul(i)
        function doIt(j) i*j endfun
        doIt
    endfun
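
The multiplier can likewise be sketched as a Python closure. This is only an analogy: Python will happily multiply an integer by a real, whereas the constraint for * in section A.1.5.1 requires both arguments and the result to share one type, so the sketch is looser than the VIMVAL program.

```python
def make_mul(i):
    """Return a function that multiplies its argument by the captured i."""
    def do_it(j):
        return i * j
    return do_it


triple = make_mul(3)       # integer multiplier
scale = make_mul(2.5)      # real multiplier from the same definition
```

The same definition serves both numeric types, which is the point of example 5-6.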


Program example 5-7 demonstrates a “password hider” program, which can be used 
to hide information, which will only be released upon presentation of the correct 
password. See [18] for further details on this sort of protection. 

Program Example 5-7: 


    type hider = functype(givenpass:T,
                          command:oneof[store:T; fetch])
                 returns(oneof[badpass;
                               didstore:hider;
                               didfetch:T])
    type pfuntype = functype(T,T) returns(boolean)

    function makePassword(password:T,
                          passfun:pfuntype,
                          hiddenObject:T)
            returns (hider)
        % makePassword returns a function which knows the password and knows the
        % hidden object, but will not reveal the hidden object unless the user
        % presents the correct password. There is also no way to uncover the
        % password itself, except by subverting the type system, e.g. using
        % a debugger (or perhaps by trial and error).
        function doIt(givenpass,command)
            % doIt is the function that is returned by makePassword. doIt
            % knows the password, because the password is in doIt's lexical
            % scope.
            % doIt returns the value iff the password presented causes
            % PASSFUN(PASSWORD,GIVENPASS) to return true.
            if ~passfun(password,givenpass) then
                make[BadPass:nil]
            else
                tagcase o:=command
                    tag store: make[DidStore:makePassword(password,passfun,o)]
                    tag fetch: make[DidFetch:hiddenObject]
                endtag
            endif
        endfun % doIt
        doIt % return doIt
    endfun % makePassword

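
The control flow of the password hider can be sketched in Python as follows. The sketch is an assumption-laden analogy: the oneof values are modeled as (tag, value) tuples, the names make_password and do_it are transliterations, and Python offers none of the type-system protection that the VIMVAL version relies on (the captured password can be inspected through the closure object).

```python
def make_password(password, passfun, hidden_object):
    """Return a closure that guards hidden_object behind password."""
    def do_it(givenpass, command):
        # command is a (tag, value) pair standing in for oneof[store:T; fetch]
        if not passfun(password, givenpass):
            return ("badpass", None)
        tag, value = command
        if tag == "store":
            # Storing returns a fresh hider with the same password.
            return ("didstore", make_password(password, passfun, value))
        if tag == "fetch":
            return ("didfetch", hidden_object)
        raise ValueError(f"unknown command tag: {tag!r}")
    return do_it


hider = make_password("pw", lambda a, b: a == b, "secret")
```

Calling hider("pw", ("fetch", None)) releases the hidden object, while any other password yields the badpass arm. A store command does not mutate anything; it returns a new hider, which keeps the example side-effect free.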
Finally, we have an example which implements lisp primitives in VIMVAL.

Program Example 5-8:

    function cons(a,b)
        make[ConsVal:record$[car:a,cdr:b]]
    endfun % cons

    % The car of null is null
    function car(a)
        tagcase b:=a
            tag Consval: b.car
            tag Nullval: a
        endtag
    endfun % car

    % the cdr of a null is null
    function cdr(a)
        tagcase b:=a
            tag Consval: b.cdr
            tag Nullval: a
        endtag
    endfun % cdr

    function nullp(a)
        is NullVal(a)
    endfun % nullp

    function lispnil()
        make[nullval:null]
    endfun % lispnil

    function length(a)
        if nullp(a) then 0
        else 1+length(cdr(a))
        endif
    endfun % length

    function append(a,b)
        if nullp(a) then b
        else cons(car(a), append(cdr(a), b))
        endif
    endfun % append

    % ith is zero-based: ith(a,0) is car(a)
    function ith(a,i)
        if i>0 then ith(cdr(a),i-1)
        else car(a)
        endif
    endfun % ith

    function reverse(a)
        % doreverse returns the first i elements of a in reverse
        function doreverse(a,i)
            if i=0 then lispnil()
            % ith is zero-based, so the ith element counted from one is ith(a,i-1)
            else cons(ith(a,i-1), doreverse(a,i-1))
            endif
        endfun % doreverse
        doreverse(a, length(a))
    endfun % reverse
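
The lisp primitives can be exercised with a small Python sketch. It is an analogy under stated assumptions: the oneof with arms ConsVal and NullVal is modeled as a (tag, payload) tuple, to_list is a hypothetical helper added only for inspection, and the reversal helper indexes with i - 1 because ith is zero-based (ith(a, 0) is car(a)).

```python
NIL = ("nullval", None)  # stands in for make[nullval:...]


def cons(a, b):
    return ("consval", (a, b))


def nullp(a):
    return a[0] == "nullval"


def car(a):
    return a if nullp(a) else a[1][0]  # the car of null is null


def cdr(a):
    return a if nullp(a) else a[1][1]  # the cdr of null is null


def length(a):
    return 0 if nullp(a) else 1 + length(cdr(a))


def ith(a, i):
    return ith(cdr(a), i - 1) if i > 0 else car(a)


def reverse(a):
    def doreverse(i):
        # returns the first i elements of a in reverse order
        return NIL if i == 0 else cons(ith(a, i - 1), doreverse(i - 1))
    return doreverse(length(a))


def to_list(a):
    """Hypothetical helper: convert a cons chain to a Python list."""
    out = []
    while not nullp(a):
        out.append(car(a))
        a = cdr(a)
    return out


xs = cons(1, cons(2, cons(3, NIL)))
```

With xs as above, to_list(reverse(xs)) is [3, 2, 1]. As in the VIMVAL version, reverse is quadratic because each ith call walks the list from the front.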


Appendix C


Listing of the VIM-VAL type checker 


This appendix contains a listing of the VIM-VAL type checker which is written in 
the CLU [13] programming language. The style is “functional”, i.e. we have been 
careful to avoid side-effects, so that the eventual translation of the VIM-VAL 
compiler into VIM-VAL will not be too painful. 


SOTA A cluster which implements the MFSA defined in definition 4-5, 
along with its operations and the predicates which can be used to 
determine type correctness. 

SET A cluster which implements the mathematical object set. 


EQUIVREL A cluster which implements equivalence relations. 


MAP A cluster which implements maps from one set of objects to 
another set of objects. 


SOTATEST A procedure which tests SOTA. 
#extend
sota = cluster[alphabet, nodename, classname:type] is
        create, equate, has_subpath_to, has_path, has_closed_path,
        close,
        get_unique_type_assignment,
        export          % for debugging only.
    where
        alphabet has get_class:proctype(alphabet) returns(classname),
                     equal:proctype(alphabet, alphabet) returns(bool),
                     get_is_terminator:proctype(alphabet) returns(bool),
        % requires: if two alphabet items A and B then
        %   A.class=B.class implies A.is_terminator=B.is_terminator
        nodename has equal:proctype(nodename, nodename) returns(bool),
        classname has equal:proctype(classname, classname) returns(bool)

    abstract = sota[alphabet,nodename,classname]
    rep = struct[equivs:ERNN,
                 closures:TNSA,
                 transitions:tntano]
    ERNN = EquivRel[NodeName]
    TNSA = map[NodeName, SA]
    SA = Set[Alphabet]
    tntano = map[NodeName, tano]
    tano = map[alphabet, no]
    no = oneof[acceptor:null,
               node:nodename]
    nopair = struct[first,second:no]
    % nodepair = struct[first,second:nodename]
    agenda = set[nopair]


    % representation invariant I(R)
    %   R.equivs agrees with R.transitions: i.e.
    %     Equivrel[NodeName]$Equivalent(R.equivs,n,m) implies
    %       R.transitions[n] = R.transitions[m]
    %   R.transitions preserves well-typedness: i.e.
    %     R.transitions[n][a] and R.transitions[n][b] are defined implies
    %       a.class=b.class
    %   R.closures agrees with R.transitions: i.e.
    %     If R.closures[n] is defined then
    %       Domain(R.transitions[n]) is a subset of R.closures[n]

    % abstraction function: R corresponds to A iff
    %   equivrel[NodeName]$equivalent(r.equivs,n,m) iff
    %     for all Q in A, <n> is state-equivalent to <m>
    %   equivrel[nodename]$equivalent(r.equivs,m,
    %       no$value_node(r.transitions[n][a])) iff
    %     for all Q in A, <n,a> is state-equivalent to <m>
    %   no$is_acceptor(r.transitions[n][a]) iff
    %     for all Q in A, <n,a> is in Q

    create = proc() returns(cvt)
        % returns the set of all type assignments
        return(rep${equivs: ERNN$create(),
                    closures: TNSA$create(),
                    transitions: TNTANO$create()})
        end create


    equate = proc(os:cvt, nodei,nodej:nodename) returns(cvt) signals(empty)
        % returns OS[nodei=nodej], (signals empty if there is none)
        if ERNN$Equivalent(os.equivs, nodei,nodej) then return(os) end
        ttd:agenda:=agenda$[nopair${first:no$make_node(nodei),
                                    second:no$make_node(nodej)}]
        % ttd: things to do, but these things have to be checked for
        %      compatibility AND put into the equivrel
        newequivs:ernn:=os.equivs
        while ~agenda$is_empty(ttd) do
            nowdo:nopair
            nowdo,ttd:=agenda$pick_rest(ttd)
            % the first thing to check is previous equivalence. If they
            % are already equivalent, then we don't need to add more.
            % after that, we should check for compatibility. The class of the
            % labels on the output transitions should be the same. We really
            % only need to test one of them from each node.
            % After that, we should gather a list of the ones that should be
            % made if these two nodes are equivalent. This really must be
            % done for the whole class of them
            if no$is_node(nowdo.first) cand no$is_node(nowdo.second) then
                nowdo1:nodename:=no$value_node(nowdo.first)
                nowdo2:nodename:=no$value_node(nowdo.second)
                if ~ernn$equivalent(newequivs, nowdo1, nowdo2) then
                    % now we actually have to equate them, but are they compatible?
                    if ~compatible(os,nowdo1,nowdo2)
                        then signal empty end
                    % we must go to the mapping and add stuff
                    ttd:=ttd | pairs_which_must_be_same(os, newequivs[nowdo1],
                                                        newequivs[nowdo2])
                    newequivs:=ernn$equate(newequivs, nowdo1, nowdo2)
                    end
            elseif no$is_node(nowdo.first) cor no$is_node(nowdo.second)
                then signal empty end
            end
        % built up newequivs, but not done yet
        % now we have to actually create the new object to return
        % we must extend the old maps
        % (not because newequivs does not partition everything correctly, it
        % does, but because we can only get the non_trivial_classes out, and
        % that is not everything)
        rettrans:tntano:=os.transitions
        retclos:tnsa:=os.closures
        for eclass:set[NodeName] in ernn$non_trivial_classes(newequivs) do
            everclosed:bool:=false  % did we ever hit a closure for this class?
            thistran:tano:=tano$create()
            thisclose:sa:=sa$create()
            for elt:nodename in set[nodename]$elements(eclass) do
                for al:alphabet,n:no in tano$entries(os.transitions[elt]) do
                    thistran:=tano$define_override(thistran,al,n)
                    end except when undefined: end
                begin
                    if everclosed then
                        thisclose:=thisclose&os.closures[elt]
                    else
                        thisclose:=os.closures[elt]
                        everclosed:=true
                        end
                    end % this is so we can keep track of if we closed it
                    except when undefined: end
                end

            % check for the closure restriction one last time
            if everclosed cand ~tano$domain_is_in(thistran, thisclose)
                then signal empty end   % not an error if never closed
            for elt:nodename in set[nodename]$elements(eclass) do
                rettrans:=tntano$define_override(rettrans,elt,thistran)
                if everclosed then
                    retclos:=tnsa$define_override(retclos,elt,thisclose)
                    end % don't define unless we actually closed it
                end
            end
        return(rep${equivs:newequivs,
                    closures:retclos,
                    transitions:rettrans})
        end equate


    % internal routine decides if two nodes are compatible.
    % does check the closure condition
    % all we have to do is look at a rep from the domain of the transitions to
    % see if they are the same class. If there is none, then it's ok on this.
    % we also have to check the closure condition
    % check that both of these are true:
    %   os.closures[n1] is undefined or contains domain(os.transitions[n2])
    %   os.closures[n2] is undefined or contains domain(os.transitions[n1])
    % if they are both defined then this is equivalent to testing
    %   if the intersection of the closure conditions contains the union
    %   of the domains (This is equivalent because we already knew
    %   that the closures contained the domain of their own functions)
    compatible = proc(os:rep, n1,n2:nodename) returns(bool)
        t1:tano:=os.transitions[n1]
            except when undefined: t1:=tano$create() end
        t2:tano:=os.transitions[n2]
            except when undefined: t2:=tano$create() end
        if tano$pick_from_domain(t1).class ~=
           tano$pick_from_domain(t2).class
            then return(false) end
            except when none: end  % ok so far
        begin
            c1:sa:=os.closures[n1]
            if ~tano$domain_is_in(t2,c1) then return(false) end
            end except when undefined:
                end % it is ok if os.closures[n1] is undefined
        begin
            c2:sa:=os.closures[n2]
            if ~tano$domain_is_in(t1,c2) then return(false) end
            end except when undefined: end % it is ok
        return(true)
        end compatible


    % internal routine which returns a set of pairs that must be the same
    % if the elements of s1 and s2 are to be the same under a modified OS.
    % the reason we don't accept the union of s1 and s2 is that we
    % would have to return all the pairs in (S1|S2) CROSS (S1|S2),
    % which is no fun.
    % this way, we won't have to return any such pairs, which speeds things up
    % (of course, we can if we want to, no guarantees here.)
    % the pairs that we return are the ones where
    pairs_which_must_be_same = proc(os:rep, s1,s2:set[nodename])
                               returns(agenda)
        % let s:=s1 union s2
        % for each element in s
        %
        % for each element in s1
        retset:agenda:=agenda$create()
        sn:sequence[nodename]:=set[nodename]$set2seq(s1)
                               || set[nodename]$set2seq(s2)
        for i:int in sequence[nodename]$indexes(sn) do
            thisname:nodename:=sn[i]
            thistano:tano:=os.transitions[thisname]
                except when undefined: thistano:=tano$create() end
            for symbol:alphabet in tano$domain_iter(thistano) do
                for j:int in int$from_to(i+1, sequence[nodename]$size(sn)) do
                    thatname:nodename:=sn[j]
                    % add what you get if you follow SYMBOL from thisname and thatname
                    retset:=retset+
                        nopair${first:os.transitions[thisname][symbol],
                                second:os.transitions[thatname][symbol]}
                        except when undefined:
                            end % if a symbol is not there, don't worry
                    end
                end
            end
        return(retset)
        end pairs_which_must_be_same


    % if has_subpath exists, then we would like this to mean the same thing as
    %     a:rep,b:nodename:=has_subpath(os,node_from,sym)
    %     return(equate(a,b,node_to))
    % but we don't use the intermediate node name
    % note that in any event, if sym.is_terminator then signals terminator
    has_subpath_to = proc(os:cvt, node_from:nodename, sym:alphabet, node_to:nodename)
                     returns(cvt) signals(empty, terminator)
        % if sym is a terminator, then node_to would have to be an acceptor,
        % which is impossible
        if sym.is_terminator then signal terminator end

        % worry about closure first
        if ~sa$elementof(sym,os.closures[node_from])
            then signal empty end
            except when undefined: end % it's ok

        nmap:tano:=os.transitions[node_from]
            except when undefined: nmap:=tano$[] end

        % check for the class restriction
        if tano$pick_from_domain(nmap).class~=sym.class
            then signal empty end
            except when none: end % it is ok

        already_to:no:=os.transitions[node_from][sym]
            except
                when undefined:
                    % just build the new object and return it
                    return(rep${equivs:os.equivs,
                                closures:os.closures,
                                transitions:
                                    tntano$define_override(
                                        os.transitions,
                                        node_from,
                                        tano$define_override(
                                            nmap, sym, no$make_node(node_to)))})
                end
        % if it is an acceptor, it can't equate to node_to
        if no$is_acceptor(already_to) then signal empty end

        nat:nodename:=no$value_node(already_to)
        if ernn$equivalent(os.equivs,nat,node_to) then
            return(os)
            end

        % it is defined, and it meets the closure condition, but the node
        % is not equivalent. Checks again to see if meets the class property
        % inside equate
        return(down(equate(up(os),nat,node_to))) resignal empty
        end has_subpath_to


    % has_subpath does the following:
    %   if os.transitions[node][sym] is defined, returns os
    %   otherwise, checks to see if the transitions that are already there
    %   are compatible with sym (if not signals empty)
    %   then creates an anonymous node which is transitioned to
    %   there if sym.is_terminator then
    % returns the nodename that we go to on sym
    %has_subpath = proc(os:cvt, node:nodename, sym:alphabet)
    %              returns(cvt,nodename) signals(empty)
    %     % has_subpath is not actually a defined function
    %     end has_subpath


    % has_path adds path <node,sym> to the transitions
    %   if ~sym.is_terminator then you get "non_terminator" signalled
    %   if sym is incompatible with the current version, signals "empty"
    %   it could either be incompatible with the closure
    %   or the transition class
    has_path = proc(os:cvt, node:nodename, sym:alphabet)
               returns(cvt) signals(empty,non_terminator)
        % check for a terminator
        if ~sym.is_terminator then signal non_terminator end

        % check for the transition already existing, if it does then just
        % return os because it is guaranteed to be an accepting node, because
        % sym.is_terminator is true
        if tano$defined(os.transitions[node],sym) then return(os) end
            except when undefined: end % it is ok

        % check the closure condition
        if ~sa$ElementOf(sym,os.closures[node]) then signal empty end
            except when undefined: end % it is ok

        % check the transition compatibility
        if tano$pick_from_domain(os.transitions[node]).class~=sym.class
            then signal empty end
            except when undefined:
                   when none:
                   end % it is ok

        % now return the new object
        old_tano:tano:=os.transitions[node]
            except when undefined: old_tano:=tano$create() end
        new_tano:tano:=tano$define_override(old_tano,sym,
                                             no$make_acceptor(nil))
        newtntano:tntano:=os.transitions
        for affected:nodename in set[nodename]$elements(os.equivs[node]) do
            newtntano:=tntano$define_override(newtntano,affected,new_tano)
            end
        return(rep$replace_transitions(os,newtntano))
        end has_path


    % if os can't meet the closure condition, then signal empty
    % otherwise return os with the new closure condition
    close = proc(os:cvt, node:nodename, syms:set[alphabet])
            returns(cvt) signals(empty)
        if ~tano$domain_is_in(os.transitions[node], syms)
            then signal empty end
            except when undefined: end % it is ok
        % now create the new os
        isyms:sa:=syms&os.closures[node]
            except when undefined: isyms:=syms end
        if sa$is_empty(isyms) then signal empty end
        retclosures:tnsa:=os.closures
        eclass:set[nodename]:=os.equivs[node]
            except when undefined: eclass:=set[nodename]$[node] end
        % all the equivalent nodes should have equal maps
        for ntofix:nodename in set[nodename]$elements(eclass) do
            retclosures:=tnsa$define_override(retclosures, ntofix, isyms)
            end
        return(rep$replace_closures(os,retclosures))
        end close

    % has_closed_path does close(has_path(os,node,sym),node,{sym})
    has_closed_path = proc(os:abstract, node:nodename, sym:alphabet)
                      returns(abstract) signals(empty)
        return(close(has_path(os,node,sym), node, sa$[sym]))
            resignal empty
        end has_closed_path


% returns the map, which describes the transition function for the 

% fsa which accepts the type assignment. 

% signals ambiguous if any of the nodes named dont have some transition 
leading away from them. Nodes can be named in closures, equivs, or 
they could have transition functions which are undefined everywhere 

also signals ambiguous if the closure of a node is not exactly 
equal to the domain of the of the transition function. This 
has two special cases: 

1) a node does not have a closure (nodes 
without a closure are ambiguous) 

2) a node has a closure, but some element of the closure does not 
have a transition. 
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% {we are guaranteed that the domain of the transition is in the closure) 
get_unique_type_ assignment = : 
proc(os:cvt) 
returns (map[nodename.map[alphabet,oneof[acceptor:aull ,node:nodename ]}}) 
signals(ambiguous) 
% check for ambiguity by finding mentioned nodes that are never used 
% several ways for it to be ambiguos: an entry could have a tano 


% with no entries, or there could be a named node somewhere with 

z not entry in transitions 

% or there could be a node mentioned in the closure that has no entry 
% in transitions 

% or there could be a node named in equivs with no entry in 

z transitions 

for nname:nodename,ntano:tano in tntano$entries(os.transitions) do 


v% if the named node does not have a closure then ambiguous 
i myclosure:sa:*os.closures[nname] 
except when undefined: signal ambiguous end 
% if any of the symbols in the closure don't have a transition 
% then ambiguous 
for symindom:alphabet in sa$elements(myclosure) do 
tano$fetch(ntano,symindom) 
end 
except when undefined: signal ambiguous end 
% if the named node has a completely undefined transition 
x function then then ambiguous 
tano$pick_from_domain(ntano) 
except when none: signal ambiguous end 
% if any of the nodes in range of the transition 
% dont have closures or have undefined transition 
% functions then ambiguous 
for sym:alphabet,arsit:no in tano$entries(ntano) do 
tagcase arsit 
tag acceptor: % do nothing 
tag node(nto:nodename): 
%% if the node does not have a closure then it is
%% ambiguous
%% tnsa$fetch(os.closures,nto)
%% except when undefined: signal ambiguous end
%% note: all the nodes are checked for this


% if there is no transition from nto, to another node
% it is ambiguous
tano$pick_from_domain(os.transitions[nto])
except when undefined,none: signal ambiguous end
end
end
end
% if any of the nodes mentioned in the equivalence classes
% dont have closures or have undefined transitions
% then ambiguous
for nt_classes:set[nodename] in ernn$non_trivial_classes(os.equivs) do
for mentioned:nodename in set[nodename]$elements(nt_classes) do
tnsa$fetch(os.closures,mentioned)
except when undefined: signal ambiguous end
tano$pick_from_domain(os.transitions[mentioned])
except when undefined,none: signal ambiguous end
end 
end 
% if any of the nodes mentioned in the closures 
% dont have closures or have undefined transitions 
% then ambiguous 
for c_node:nodename in tnsa$domain_iter(os.closures) do
tnsa$fetch(os.closures,c_node)
except when undefined: signal ambiguous end
tano$pick_from_domain(os.transitions[c_node])
except when undefined,none: signal ambiguous end
end
return(os.transitions)
end get_unique_type_assignment
% export returns a copy of the internal representation for os
% note that since everything is functional, this is perfectly safe 
export = proc(os:cvt) returns(rep) 
return(os) 
end export 
end sota 
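
The ambiguity rules that get_unique_type_assignment enforces can be restated compactly. The following is a minimal Python sketch, not part of the original CLU source: plain dictionaries and sets stand in for the map and set clusters, and the names `is_ambiguous`, `well_defined`, and `ACCEPT` are hypothetical.

```python
ACCEPT = None  # assumption: stands in for the `acceptor` arm of the oneof


def is_ambiguous(closures, transitions, equiv_classes):
    """closures: node -> set of symbols
    transitions: node -> {symbol: node or ACCEPT}
    equiv_classes: iterable of sets of nodes (non-trivial classes only)"""

    def well_defined(node):
        # a node needs a closure and a transition function defined somewhere
        return node in closures and bool(transitions.get(node))

    for node, trans in transitions.items():
        if not well_defined(node):
            return True
        # every symbol in the closure must have a transition
        if any(sym not in trans for sym in closures[node]):
            return True
        # every node reached by a transition must itself have transitions
        for target in trans.values():
            if target is not ACCEPT and not transitions.get(target):
                return True

    # nodes named only in equivs or closures must also be fully described
    mentioned = set(closures)
    for cls in equiv_classes:
        mentioned |= set(cls)
    return any(not well_defined(n) for n in mentioned)
```

For instance, `is_ambiguous({}, {}, [{1, 2}])` is true: two nodes were equated but nothing else is known about them, which is the situation the s_1e2 test in sotatest expects to be reported as ambiguous.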
#extend
set = cluster[t:type] is
create, new, % these are the same 
add, % add a new element 
contains,gt, % these are the same 
elementof, % other direction for contains 
mem, % does some element of a set satisfy a predicate 
elements,cons,pick,pick_rest,is_empty, % misc 
equal, % are they the same set? 
union,or, % these are the same 
intersection, and, % these are the same 
sub, % set subtraction 
set2seq 
where t has equal:proctype(t,t) returns (bool)


rep = sequence[t] 
% create the empty set 
new = proc() returns(cvt) return(rep$[]) end new


% add an element
add = proc(s:cvt, el:t) returns(cvt)
if up(s)>el then return(s) else return(rep$addh(s,el)) end
% low:int:=1
% high:int:=rep$size(s)
% while low<=high do
% i:int:=(low+high)/2
% if s[i]=el then return(s)
% elseif s[i]<el then low:=i+1
% else high:=i-1
% end
% end
% return (rep$subseq(s,1,high)
% || rep$[el]
% || rep$subseq(s, low, rep$size(s)-high))
end add


% membership operator 
gt = proc(s:cvt, el:t) returns(bool) 
for elin:t in rep$elements(s) do
if elin=el then return (true) end
end
return(false)
% low:int:=1
% high:int:=rep$size(s)
% while low<=high do
% i:int:=(low+high)/2
% if s[i]=el then return(true)
% elseif s[i]<el then low:=i+1
% else high:=i-1
% end
% end
% return (false)
end gt


% the other name for the membership operator 
contains = proc(s:set[t], el:t) returns(bool)
return(s>el)
end contains


% the other direction for the membership operator
elementof = proc(el:t, s:set[t]) returns(bool)
return(s>el)
end elementof 


% return true iff there is an element K in S, such that PRED(EL,K) 


mem = proc(el:t, s:set[t], pred:proctype(t,t) returns(bool)) returns(bool)
for knownel:t in set[t]$elements(s) do
if pred(el,knownel) then return(true) end
end
return(false) 
end mem 


elements = iter(s:cvt) yields(t) 
for e:t in rep$elements(s) do yield(e) end
end elements 


cons = proc(s:sequence[t]) returns(set[t]) 
retval:set[t]:=set[t]$[] 
for e:t in sequence[t]$elements(s) do 
retval:=retval+e
end 
return(retval) 
end cons 


pick = proc(s:cvt) returns(t) signals(empty) 
return(s[1]) except when bounds: signal empty end 
end pick 


pick_rest = proc(s:cvt) returns(t,cvt) signals(empty)
return(s[1],rep$reml(s))
except when bounds: signal empty end 
end pick_rest 


is_empty = proc(s:cvt) returns(bool) 
return (rep$empty(s)) 
end is_empty 


% two sets are the same if they have exactly the same elements 
equal = proc(s1,s2:cvt) returns(bool)
if s1=s2 then return(true) end % might as well optimize
if rep$size(s1)~=rep$size(s2) then return(false) end
for el:t in elements(up(s1)) do
if up(s2)~>el then return(false) end
end
% everything in s2 is in s1, and they are in 1-1 correspondence, so
return(true)
end equal 


or = proc(s1,s2:set[t]) returns(set[t])
for el:t in elements(s1) do
s2:=s2+el
end
return(s2)
% size1:int:=rep$size(s1)
% size2:int:=rep$size(s2)
% retval:array[t]:=array[t]$predict(1,size1+size2)
% indx1:int:=1
% indx2:int:=1
% while indx1<=size1 cand indx2<=size2 do
% if s1[indx1]=s2[indx2] then
% array[t]$addh(retval,s1[indx1])
% indx1:=indx1+1
% indx2:=indx2+1
% elseif s1[indx1]<s2[indx2] then
% array[t]$addh(retval,s1[indx1])
% indx1:=indx1+1
% else
% array[t]$addh(retval, s2[indx2])
% indx2:=indx2+1
% end
% end
% % one of the indx's is over
% if indx1<size1 then
% return(rep$a2s(retval)||rep$subseq(s1, indx1+1,size1-indx1))
% elseif indx2<size2 then
% return(rep$a2s(retval)||rep$subseq(s2, indx2+1,size2-indx2))
% else return(rep$a2s(retval))
% end
end or


union = proc(s1,s2:set[t]) returns(set[t]) return(s1|s2) end union


and = proc(s1,s2:set[t]) returns(set[t])
retset:set[t]:=set[t]$[]
for el:t in elements(s1) do
if s2>el then retset:=retset+el end
end
return(retset)
% size1:int:=rep$size(s1)
% size2:int:=rep$size(s2)
% retval:array[t]:=array[t]$predict(1, int$min(size1,size2))
% indx1:int:=1
% indx2:int:=1
% while indx1<=size1 cand indx2<=size2 do
% if s1[indx1]=s2[indx2] then
% array[t]$addh(retval,s1[indx1])
% indx1:=indx1+1
% indx2:=indx2+1
% elseif s1[indx1]<s2[indx2] then indx1:=indx1+1
% else indx2:=indx2+1
% end
% end
% return(rep$a2s(retval))
end and 


intersection=proc(s1,s2:set[t]) returns(set[t]) return(s1&s2) end intersection


sub = proc(s1,s2:set[t]) returns(set[t])
retset:set[t]:=set[t]$[]
for el:t in elements(s1) do
if s2~>el then retset:=retset+el end
end
return(retset)
% size1:int:=rep$size(s1)
% size2:int:=rep$size(s2)
% retval:array[t]:=array[t]$predict(1, int$min(size1,size2))
% indx1:int:=1
% indx2:int:=1
% while indx1<=size1 cand indx2<=size2 do
% if s1[indx1]=s2[indx2] then
% indx1:=indx1+1
% indx2:=indx2+1
% elseif s1[indx1]<s2[indx2] then
% array[t]$addh(retval,s1[indx2])
% indx1:=indx1+1
% else indx2:=indx2+1
% end
% end
% if indx1<size1 then
% return(rep$a2s(retval)||rep$subseq(s1, indx1+1,size1-indx1))
% else return(rep$a2s(retval))
% end
end sub


set2seq = proc(s:cvt) returns(sequence[t]) 
return(s) 
end set2seq 


create = proc() returns(cvt) 
return(rep$new( )) 
end create 

end set 
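
Several operations above carry a second, sorted-sequence variant (binary search in add and gt, a merge in or, and, and sub). That approach needs an ordering on t, which the cluster's where clause does not require. The following is a minimal Python sketch of the idea, assuming comparable elements and using tuples for the immutable sequence; the names `sorted_add`, `sorted_contains`, and `sorted_union` are hypothetical and do not appear in the CLU source.

```python
from bisect import bisect_left


def sorted_add(s, el):
    """Return a sorted, duplicate-free tuple equal to s with el added."""
    i = bisect_left(s, el)          # binary search for the insertion point
    if i < len(s) and s[i] == el:
        return s                    # already present: the set is unchanged
    return s[:i] + (el,) + s[i:]


def sorted_contains(s, el):
    """Membership test by binary search."""
    i = bisect_left(s, el)
    return i < len(s) and s[i] == el


def sorted_union(s1, s2):
    """Merge two sorted, duplicate-free tuples into their union."""
    out, i, j = [], 0, 0
    while i < len(s1) and j < len(s2):
        if s1[i] == s2[j]:
            out.append(s1[i])
            i += 1
            j += 1
        elif s1[i] < s2[j]:
            out.append(s1[i])
            i += 1
        else:
            out.append(s2[j])
            j += 1
    # one of the inputs is exhausted; append what is left of the other
    return tuple(out) + s1[i:] + s2[j:]
```

With sorted sequences, membership costs a logarithmic number of comparisons and union is linear in the two sizes, whereas the linear-scan versions only need equality on t.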




ps:<kuszmaul.thesis.valclu>equivrel.clu.14 21 April 1984 14:00:57 Page 1


#extend 
equivrel = cluster[T: type] is
create, equate, non_trivial_classes,fetch,equivalent,cons,new,equal
where T has equal:proctype(t,t) returns (bool)
% this is immutable
rep=map[T,set[T]]
% abstraction function A(e:rep), if e[x] is defined, then x is in
% the class with elements of e[x]. If e[x] is undefined, x is in { x }
% by itself


% rep invariant R(r:rep) if r[x] is defined then |r[x]|>1 and
% x is in r[x], and for all y in r[x] r[y] is defined


% return an equivalence relation with no relations. 
% every element of T has its own class
create = proc() returns(cvt)
return(rep$create())

end create 


% create an equivalence relation with the added relationship vali=valj
equate = proc(er:cvt,vali,valj:T) returns(cvt)
if set[T]$ElementOf(valj,up(er)[vali]) then return(er) end
newclass:set[T]:=set[T]$Union(up(er)[vali],up(er)[valj])
for affected:T in set[T]$elements(newclass) do
er:=rep$define_override(er,affected,newclass)
end 
return(er) 
end equate 


% yield all the classes which have more than one element in them
% watch out! This does not yield all the classes because there
% is no way to generate a complete list of T. Anything
% not yielded is in its own class
non_trivial_classes = iter(er:cvt) yields(set[T])
did:set[T]:=set[T]$create()
for elt:T,i:set[T] in rep$entries(er) do
if ~set[T]$ElementOf(elt,did) then
did:=did+elt
yield(i) 
end 
end 
end non_trivial_classes 


% returns the class that val is in
fetch = proc(er:cvt, val:t) returns(set[t])
% if t is not defined, then return set[t]$[val]
return(er[val]) except when undefined: return(set[t]$[val]) end
end fetch


% if vali is in er[valj] then return true, else false
equivalent = proc(er:equivrel[T], vali,valj:t) returns(bool)
return(set[T]$elementOf(vali,er[valj]))
end equivalent 


new = proc() returns(equivrel[T]) return(create()) end new 


cons = proc(ss:sequence[set[T]]) returns(cvt) signals(not_well_defined)
ret:rep:=rep$create()
for cl:set[T] in sequence[set[T]]$elements(ss) do
for el:T in set[T]$elements(cl) do
ret:=rep$define(ret,el,cl)
except when already_defined: signal not_well_defined end
end
end
return(ret)
end cons 


% this depends on the fact that there are no singletons! 
equal = proc(a,b:cvt) returns(bool)
return(a=b)
end equal 
end equivrel 
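
The representation used by equivrel (a map from each equated element to its whole class, with unequated elements implicitly in singleton classes) can be illustrated as follows. This is a minimal Python sketch under that reading of the rep invariant, not a translation of the cluster; `er_fetch`, `er_equate`, and `er_non_trivial_classes` are hypothetical names, and frozensets stand in for set[T].

```python
def er_fetch(er, val):
    """Class of val; an element that was never equated is alone in its class."""
    return er.get(val, frozenset([val]))


def er_equate(er, a, b):
    """Return a new relation in which the classes of a and b are merged.

    The input relation is left untouched, mirroring the immutable cluster.
    """
    if b in er_fetch(er, a):
        return er
    merged = er_fetch(er, a) | er_fetch(er, b)
    new = dict(er)
    for member in merged:
        new[member] = merged        # every member points at the merged class
    return new


def er_non_trivial_classes(er):
    """The classes with more than one element (this sketch reports each once)."""
    return set(er.values())
```

Because only equated elements are stored, two relations are equal exactly when their maps are equal, which is the property the cluster's equal operation relies on ("there are no singletons").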


map = cluster[domain,range: type] is
create, fetch,
define, define_override,cons,new,
defined,
pick_from_domain, domain_is_in,domain_iter,entries,
equal
where domain has equal: proctype(domain, domain) returns(bool),
range has equal:proctype(range,range) returns(bool)
rep = oneof[empty:null,
onedefined: entry]
entry = struct[d: domain,
r:range,
rest:map[domain, range]]


% returns a function which is undefined everywhere
create = proc() returns(cvt)
return(rep$make_empty(nil))
end create 


% if fun(x) is defined, then fun(x) is returned, else signals undefined 
fetch = proc(fun:map[domain,range], x:domain) returns(range)
signals(undefined)
for d:domain,r:range in entries(fun) do
if d=x then return(r) end
end 
signal undefined 
end fetch 


% if fun(x) is defined, then returns true, else false 
defined = proc(fun:map[domain,range], x:domain) returns(bool)
fetch(fun,x) except when undefined: return(false) end 
return(true) 
end defined 


% if fun(x) is defined to be different from f_of_x,
% then signals already_defined
% otherwise, returns a function which is the same as fun, except that
% it is defined to be f_of_x at x.
define = proc(fun:map[domain,range], x:domain, f_of_x:range)
returns(cvt) signals(already_defined)
if fun[x]=f_of_x then return(down(fun))
else signal already_defined end
except when undefined:
return(rep$make_onedefined(
entry${d:x,r:f_of_x,rest:fun}))
end 
end define 


% an internal routine which signals SAME if fun(x)=f_of_x 
% if fun(x) is undefined signals undefined 
% and otherwise returns a function which is the same as fun, except that 
% fun[x]=f_of_x
do_define_override = proc(fun:cvt, x:domain, f_of_x:range)
returns(cvt) signals(same, undefined)
tagcase fun
tag empty: signal undefined
tag onedefined(e:entry):
if e.d=x then
if e.r=f_of_x then signal same
else return(rep$make_onedefined(
entry$replace_r(e,f_of_x)))
end
else return(rep$make_onedefined(
entry$replace_rest(e,do_define_override(
e.rest,x,f_of_x))))
resignal same,undefined
end
end
end do_define_override
% returns fun, except that it is defined to be f_of_x at x
% this overrides any old defns that fun had
define_override = proc(fun:map[domain,range], x:domain, f_of_x:range)
returns(map[domain,range])
% we must get rid of the previous definition, so we can't do it
% smoothly by just consing a new thing onto the head
return(do_define_override(fun,x,f_of_x))
except when same: return(fun)
when undefined:
return(up(rep$make_onedefined(
entry${d:x,r:f_of_x,rest:fun})))
end
end define_override


new = proc() returns(map[domain,range]) return(create()) end new


cons = proc(ents: sequence[struct[d:domain,r:range]]) 
returns(map[domain, range]) 
signals(not_well_defined)
en=struct[d:domain,r:range]
ret:map[domain,range]:=map[domain,range]$create()
for e:en in sequence[en]$elements(ents) do
ret:=define(ret,e.d,e.r)
except when already_defined: signal not_well_defined end
end
return(ret) 
end cons 


% if fun is undefined forall values then signals (none), 
% else returns a value for which fun is defined 
pick_from_domain = proc(fun:cvt) returns(domain) signals(none)
tagcase fun
tag empty: signal none
tag onedefined(e:entry): return(e.d)
end 
end pick_from_domain 


% if domain(fun) is in superdomain returns true, else false
domain_is_in = proc(fun:map[domain,range], superdomain:set[domain])
returns(bool)
for d:domain in domain_iter(fun) do
if ~set[domain]$ElementOf(d, superdomain) then return(false) end
end 
return(true) 
end domain_is_in 


% yields all the values in domain(fun)
domain_iter = iter(fun:map[domain,range]) yields(domain)
for d:domain,r:range in entries(fun) do
yield(d)
end 
end domain_iter 


% yields the pairs (d,r) where r=fun[d], and d is in the domain(fun) 
entries = iter(fun:cvt) yields(domain,range)
while (true) do
tagcase fun
tag empty: return
tag onedefined(e:entry): yield(e.d,e.r) fun:=down(e.rest)
end
end
end entries


equal = proc(f1,f2:map[domain,range]) returns(bool)
d:domain r:range
begin
for d,r in entries(f1) do
if f2[d]~=r then return(false) end
end
for d,r in entries(f2) do
if f1[d]~=r then return(false) end
end
end
except when undefined: return(false) end
return(true)
end equal 
end map 
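
The map cluster is a functional association list: define conses a new entry onto the head, while define_override rebuilds the chain so that an old binding is replaced instead of shadowed. The following is a minimal Python sketch of that distinction, not a translation of the CLU code; nested tuples `(d, r, rest)` stand in for the entry struct, Python exceptions stand in for CLU signals, and the names `map_fetch`, `map_define`, and `map_define_override` are hypothetical.

```python
EMPTY = None  # assumption: stands in for the `empty` arm of the rep


def map_fetch(fun, x):
    """Walk the chain; KeyError plays the role of `signal undefined`."""
    while fun is not EMPTY:
        d, r, rest = fun
        if d == x:
            return r
        fun = rest
    raise KeyError(x)


def map_define(fun, x, y):
    """Add x -> y; ValueError plays the role of `signal already_defined`."""
    try:
        if map_fetch(fun, x) == y:
            return fun                  # same binding already present
    except KeyError:
        return (x, y, fun)              # cons a new entry onto the head
    raise ValueError("already_defined")


def map_define_override(fun, x, y):
    """Return fun with x bound to y, replacing any earlier binding for x."""

    def go(f):
        if f is EMPTY:
            raise KeyError(x)
        d, r, rest = f
        if d == x:
            return f if r == y else (d, y, rest)
        return (d, r, go(rest))         # rebuild the prefix above the change

    try:
        return go(fun)
    except KeyError:
        return (x, y, fun)              # x was not bound: cons onto the head
```

Replacing the entry in place keeps each domain value in the chain at most once, so iterating over the entries never yields a stale binding.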
#extend
alphabet=struct[class:string,
is_terminator:bool,
name:string]
nodename=int % we will use negatives if we need dummy's
classname=string
vimsota=sota[alphabet,nodename,classname]
tmap=map[alphabet,oneof[acceptor:null,node:nodename]]
vimsotarep=struct[equivs:ernn, closures:tnsa, transitions:tntano]
ernn=equivrel[nodename]
snn=set[nodename]
tnsa = map[nodename, sa]
sa=set[alphabet]
tntano=map[nodename, tano]
tn_ent=struct[d:nodename,r:tano]
ta_ent=struct[d:alphabet,r:no]
ts_ent=struct[d:nodename,r:sa]
tano=map[alphabet,no]
no=oneof[acceptor:null, node:nodename]


% this routine does some testing on the sota 
sotatest = proc() 

vsr=vimsotarep
putl=stream$putl
po:stream:=stream$primary_output()


% first test, do a create, and get the rep which should be totally empty
s_create:vimsota:=vimsota$create()
sexpect("s_create",s_create,
vsr${equivs:ernn$[], closures:tnsa$[], transitions:tntano$[]},
true)


% now we have really tested the create out. That really only 
% gives us a little confidence in the lower level objects, 

% since create is so simple. 

noa:no:=no$make_acceptor(nil)
no1:no:=no$make_node(1) no2:no:=no$make_node(2) no3:no:=no$make_node(3)
no4:no:=no$make_node(4) no5:no:=no$make_node(5) no6:no:=no$make_node(6)


a_int:alphabet:=alphabet${class:"INT", is_terminator:TRUE, name:"INT"}
a_string:alphabet:=
alphabet${class:"STRING", is_terminator:TRUE, name:"STRING"}
a_real:alphabet:=alphabet${class:"REAL", is_terminator:TRUE, name:"REAL"}
a_array:alphabet:=
alphabet${class:"ARRAY", is_terminator:FALSE, name:"ARRAY"}
a_geta:alphabet:=
alphabet${class:"STRUCT", is_terminator:FALSE, name:"GET_A"}
a_getb:alphabet:=
alphabet${class:"STRUCT", is_terminator:FALSE, name:"GET_B"}
a_getc:alphabet:=
alphabet${class:"STRUCT", is_terminator:FALSE, name:"GET_C"}


% lets try equating two nodes. We should then get an ambiguous error 
% if we try to get the typemap
s_1e2:vimsota:=vimsota$equate(s_create,1,2)
sexpect("s_1e2",s_1e2,
vsr${equivs:ernn$[snn$[1,2]],
closures:tnsa$[], transitions:tntano$[]},
false)
% the transitions and closures should be completely undefined
% the equivclass should have exactly {1,2} in it


% try something really fancy:
% a real problem: N1 is an array of N2
% N2 is an int
% does it work?

% N1 = ARRAY[N2]
% N2 = ARRAY[N1]
% does it work?

% N1 = INT
% N2 = ARRAY
% does it not work?

% N1 = ARRAY[N2]
% N3 = ARRAY[N4]
% N1 = N3
% does it work?

% N1 = ARRAY[N2]
% N3 = ARRAY[N4]
% N2 = INT
% N4 = STRING
% N1 = N3
% does it not work

% N1 = CLOSED_STRUCT[A:N2,B:N3]
% N2 = INT
% N3 = STRING
% N1 = OPEN_STRUCT[A:N4]
% does it work

% N1 = CLOSED_STRUCT[A:N2,B:N3]
% N2 = INT
% N3 = STRING
% N1 = OPEN_STRUCT[C:N3]
% does it not work

% N1 = CLOSED_STRUCT[A:n2,B:N3]
% N2 = INT
% N4 = CLOSED_STRUCT[A:n5:b:n6]
% n6 = STRING
% N1 = N4
% does it work

% that pretty well tests the closure with equates

% now for some recursion
% N1 = ARRAY[N1]
% does it work?

% N1 = ARRAY[N2]
% N2 = ARRAY[N1]
% does it work?

% N1 = CLOSED_STRUCT[a:N2, b:N3]
% N2 = CLOSED_STRUCT[a:N2, b:N4]
% N4 = N1
% does it work?

% N1 = CLOSED_STRUCT[a:N2, b:N3]
% N2 = CLOSED_STRUCT[a:N1, c:N3]
% does it work?

% CLOSED_STRUCT[A:N2, B:N3]
% N1
% n2
% CLOSED_STRUCT[A:N2, C:N3]
% does it not work?

% test the terminators to see if it won't allow has_path to be a non-terminator

% N1 = ARRAY[N2]
% does it not work (ambiguity)

% the comments are repeated:

% try something really fancy:
% a real problem: N1 is an array of N2
% N2 is an int
% does it work?
ev:vsr:=vsr${equivs:ernn$[],
closures:tnsa$[ts_ent${d:1,r:sa$[a_array]},
ts_ent${d:2,r:sa$[a_int]}],
transitions:tntano$[
tn_ent${d:1,r:tano$[ta_ent${d:a_array,r:no2}]},
tn_ent${d:2,r:tano$[ta_ent${d:a_int,r:noa}]}]}
sexpect("N2=INT,N1=ARRAY[N2]",
vimsota$has_subpath_to(
vimsota$close(vimsota$has_closed_path(s_create,2,a_int),
1,sa$[a_array]),
1,a_array,2),
ev, true)
sexpect("N1=ARRAY[N2],N2=INT",
vimsota$has_closed_path( 
vimsota$close(vimsota$has_subpath_to(s_create,1,a_array,2), 
1,sa$[a_array]), 
2,a_int), 
ev, true) 


% N1 = ARRAY[N2]
% N2 = ARRAY[N1]
% does it work?
sexpect("N1=ARRAY[N2],N2=ARRAY[N1]",
vimsota$close(
vimsota$has_subpath_to
(vimsota$close(
vimsota$has_subpath_to(s_create,1,a_array,2),
1,sa$[a_array]),
2,a_array,1),
2,sa$[a_array]),
vsr${equivs:ernn$[],
closures:tnsa$[ts_ent${d:1,r:sa$[a_array]},
ts_ent${d:2,r:sa$[a_array]}],
transitions:
tntano$[tn_ent${d:1,r:tano$[ta_ent${d:a_array,r:no2}]},
tn_ent${d:2,r:tano$[ta_ent${d:a_array,r:no1}]}]},
true)


% N1 = INT
% N2 = ARRAY[N1]
% does it work?
sexpect("N1=INT,N2=ARRAY[N1]",
vimsota$close(vimsota$has_subpath_to(
vimsota$has_closed_path(s_create,1,a_int),
2,a_array,1),
2,sa$[a_array]),
vsr${equivs:ernn$[],
closures:tnsa$[ts_ent${d:1,r:sa$[a_int]},
ts_ent${d:2,r:sa$[a_array]}],
transitions:
tntano$[tn_ent${d:1,r:tano$[ta_ent${d:a_int,r:noa}]},
tn_ent${d:2,r:tano$[ta_ent${d:a_array,r:no1}]}]},
true)


% same thing without the closure on the int 
sexpect("N1=NC_INT,N2=ARRAY[N1]",
vimsota$close(vimsota$has_subpath_to(
vimsota$has_path(s_create,1,a_int),
2,a_array,1),
2,sa$[a_array]),
vsr${equivs:ernn$[],
closures:tnsa$[ts_ent${d:2,r:sa$[a_array]}],
transitions:
tntano$[tn_ent${d:1,r:tano$[ta_ent${d:a_int,r:noa}]},
tn_ent${d:2,r:tano$[ta_ent${d:a_array,r:no1}]}]},
false)


% same thing, without the closure on the array 
sexpect("N1=INT,N2=NC_ARRAY[N1]",
vimsota$has_subpath_to(
vimsota$has_closed_path(s_create,1,a_int),
2,a_array,1),
vsr${equivs:ernn$[],
closures:tnsa$[ts_ent${d:1,r:sa$[a_int]}],
transitions:
tntano$[tn_ent${d:1,r:tano$[ta_ent${d:a_int,r:noa}]},
tn_ent${d:2,r:tano$[ta_ent${d:a_array,r:no1}]}]},
false)


% N1 = ARRAY[N2]
% N3 = ARRAY[N4]
% N1 = N3
% N2 = INT
% should work when all closed
tmp1:vimsota:= vimsota$close(
vimsota$close(
vimsota$has_subpath_to(
vimsota$has_subpath_to(s_create,1,a_array,2),
3,a_array,4),
3,sa$[a_array]),
1,sa$[a_array])
tmp1:= vimsota$has_closed_path(vimsota$equate(tmp1,1,3),
2,a_int)
sexpect("N1=A[N2],N3=A[N4],N1=N3,N2=INT",
tmp1,
vsr${equivs:ernn$[snn$[1,3],snn$[2,4]],
closures:tnsa$[ts_ent${d:1,r:sa$[a_array]},
ts_ent${d:2,r:sa$[a_int]},
ts_ent${d:3,r:sa$[a_array]},
ts_ent${d:4,r:sa$[a_int]}],
transitions:
tntano$[tn_ent${d:1,r:tano$[ta_ent${d:a_array,r:no2}]},
tn_ent${d:2,r:tano$[ta_ent${d:a_int,r:noa}]},
tn_ent${d:3,r:tano$[ta_ent${d:a_array,r:no4}]},
tn_ent${d:4,r:tano$[ta_ent${d:a_int,r:noa}]}]},
true)
% N1 = ARRAY[N2]
% N3 = ARRAY[N4]
% N2 = INT
% N4 = STRING
% N1 = N3
% can't build it, don't even bother with the closures


tmp:vimsota:=
vimsota$has_path(vimsota$has_path(
vimsota$has_subpath_to(
vimsota$has_subpath_to(s_create,1,a_array,2),
3,a_array,4),
2,a_int),
4,a_string)
% should work up to here
begin
vimsota$equate(tmp,1,3)
stream$putl(stream$primary_output(), "Can build exA, wrong")
signal failure("Can build exA, wrong")
end
except when empty:
stream$putl(stream$primary_output(),"Can't build exA, ok")
end


% N1 = CLOSED_STRUCT[A:N2,B:N3]
% N2 = INT
% N3 = STRING
% N1 = OPEN_STRUCT[A:N4]
% should work
tmp:=vimsota$has_subpath_to(
vimsota$has_subpath_to(
vimsota$close(s_create,1,sa$[a_geta,a_getb]),
1,a_geta,2),
1,a_getb,3)
tmp:=vimsota$has_closed_path(tmp,2,a_int)
tmp:=vimsota$has_closed_path(tmp,3,a_string)
tmp:=vimsota$has_subpath_to(tmp,1,a_geta,4)
sexpect("N1=CS[A:N2,B:N3],N2=INT,N3=S,N1=O[a:n4]", tmp,
vsr${equivs:ernn$[snn$[2,4]],
closures:tnsa$[ts_ent${d:1,r:sa$[a_geta,a_getb]},
ts_ent${d:2,r:sa$[a_int]},
ts_ent${d:3,r:sa$[a_string]},
ts_ent${d:4,r:sa$[a_int]}],
transitions:
tntano$[tn_ent${d:1,r:tano$[ta_ent${d:a_geta,r:no2},
ta_ent${d:a_getb,r:no3}]},
tn_ent${d:2,r:tano$[ta_ent${d:a_int,r:noa}]},
tn_ent${d:3,r:tano$[ta_ent${d:a_string,r:noa}]},
tn_ent${d:4,r:tano$[ta_ent${d:a_int,r:noa}]}]},
true)


% N1 = CLOSED_STRUCT[A:N2,B:N3]
% N1 = OPEN_STRUCT[C:N3]
% does it not work because of closure violation
tmp:=vimsota$close(vimsota$has_subpath_to(
vimsota$has_subpath_to(s_create,1,a_geta,2),
1,a_getb,3),
1,sa$[a_geta,a_getb])
begin
vimsota$has_subpath_to(tmp,1,a_getc,3)
signal failure("Could build s_cab_pc, wrong")
end
except when empty: stream$putl(stream$primary_output(),
"Couldnt build s_cab_pc, ok") end


% N1 = OPEN_STRUCT[A:n2,B:N3]
% N2 = INT
% N4 = OPEN_STRUCT[A:n5:b:n6]
% n6 = STRING
% N1 = N4
% does it not work


tmp:=vimsota$has_subpath_to(vimsota$has_subpath_to(s_create,1,a_geta,2),
1,a_getb,3)
tmp:=vimsota$has_closed_path(tmp,2,a_int)
tmp:=vimsota$has_subpath_to(vimsota$has_subpath_to(tmp,1,a_geta,5),
1,a_getb,6)
tmp:=vimsota$has_closed_path(vimsota$equate(tmp,1,4),6,a_string)
_14_trans:tano:=tano$[ta_ent${d:a_geta,r:no2},
ta_ent${d:a_getb,r:no3}]
_25_trans:tano:=tano$[ta_ent${d:a_int,r:noa}]
_36_trans:tano:=tano$[ta_ent${d:a_string,r:noa}]
mytrans:tntano:= tntano$[tn_ent${d:1, r:_14_trans},
tn_ent${d:4, r:_14_trans},
tn_ent${d:2, r:_25_trans},
tn_ent${d:5, r:_25_trans},
tn_ent${d:3, r:_36_trans},
tn_ent${d:6, r:_36_trans}]


sexpect("Two-defined struct unclosed",tmp,
vsr${equivs:ernn$[snn$[2,5],snn$[3,6],snn$[1,4]],
closures:tnsa$[ts_ent${d:2,r:sa$[a_int]},
ts_ent${d:5,r:sa$[a_int]},
ts_ent${d:3,r:sa$[a_string]},
ts_ent${d:6,r:sa$[a_string]}],
transitions:mytrans},
false)
sexpect("Two-defined struct closed",
vimsota$close(tmp,1,sa$[a_geta,a_getb]),
vsr${equivs:ernn$[snn$[2,5],snn$[3,6],snn$[1,4]],
closures:tnsa$[ts_ent${d:2,r:sa$[a_int]},
ts_ent${d:5,r:sa$[a_int]},
ts_ent${d:3,r:sa$[a_string]},
ts_ent${d:6,r:sa$[a_string]},
ts_ent${d:1,r:sa$[a_geta,a_getb]},
ts_ent${d:4,r:sa$[a_geta,a_getb]}],
transitions:mytrans},
true)


% that pretty well tests the closure with equates
% now for some recursion
% N1 = ARRAY[N1]
% does it work?
sexpect("N1=A[N1]",
vimsota$close(vimsota$has_subpath_to(s_create,1,a_array,1),
1,sa$[a_array]),
vsr${equivs:ernn$[],
closures:tnsa$[ts_ent${d:1,r:sa$[a_array]}],
transitions:
tntano$[tn_ent${d:1,r:tano$[ta_ent${d:a_array,r:no1}]}]},
true)


% N1 = ARRAY[N2]
% N2 = ARRAY[N1]
sexpect("N1=A[N2] N2=A[N1]",
vimsota$has_subpath_to(
vimsota$has_subpath_to(
vimsota$close(
vimsota$close(s_create,2,sa$[a_array]),
1,sa$[a_array]),
1,a_array,2), 2,a_array,1),
vsr${equivs:ernn$[],
closures:tnsa$[ts_ent${d:1,r:sa$[a_array]},
ts_ent${d:2,r:sa$[a_array]}],
transitions:
tntano$[tn_ent${d:1, r:tano$[ta_ent${d:a_array,r:no2}]},
tn_ent${d:2, r:tano$[ta_ent${d:a_array,r:no1}]}]},
true)


% N1 = closed_STRUCT[a:N2, b:N3]
% N2 = open_STRUCT[a:N2, b:N4]
% N4 = N1
% N2 = N3
% N3 = N4
% everything should come out to be the same thing


tmp:=vimsota$close(vimsota$has_subpath_to(
vimsota$has_subpath_to(s_create,1,a_geta,2),
1,a_getb,3),
1,sa$[a_geta,a_getb])
tmp:=vimsota$has_subpath_to(vimsota$has_subpath_to(tmp,2,a_geta,2),
2,a_getb,4)
tmp:=vimsota$equate(vimsota$equate(vimsota$equate(tmp,1,4),2,3),3,4)
_trans:tano:=tano$[ta_ent${d:a_geta,r:no1},
ta_ent${d:a_getb,r:no2}]
_close:sa:=sa$[a_geta,a_getb]
sexpect("Ex C", tmp,
vsr${equivs:ernn$[snn$[1,2,3,4]],
closures:tnsa$[ts_ent${d:1,r:_close},
ts_ent${d:2,r:_close},
ts_ent${d:3,r:_close},
ts_ent${d:4,r:_close}],
transitions:tntano$[tn_ent${d:1,r:_trans},
tn_ent${d:2,r:_trans},
tn_ent${d:3,r:_trans},
tn_ent${d:4,r:_trans}]},
true)
% N1 = CLOSED_STRUCT[a:N2, b:N3]
% N2 = CLOSED_STRUCT[a:N1, c:N3]
% N3 = INT
% does it work?
tmp:=vimsota$has_closed_path(s_create,3,a_int)
tmp:=vimsota$close(vimsota$has_subpath_to(
vimsota$has_subpath_to(tmp,1,a_geta,2),
1,a_getb,3),
1,sa$[a_geta,a_getb])
tmp:=vimsota$close(vimsota$has_subpath_to(
vimsota$has_subpath_to(tmp,2,a_geta,1),
2,a_getc,3),
2,sa$[a_geta,a_getc])
sexpect(
"Ex D", tmp,
vsr${equivs:ernn$[],
closures:tnsa$[ts_ent${d:1,r:sa$[a_geta,a_getb]},
ts_ent${d:2,r:sa$[a_geta,a_getc]},
ts_ent${d:3,r:sa$[a_int]}],
transitions:tntano$[
tn_ent${d:1,r:tano$[ta_ent${d:a_geta,r:no2},
ta_ent${d:a_getb,r:no3}]},
tn_ent${d:2,r:tano$[ta_ent${d:a_geta,r:no1},
ta_ent${d:a_getc,r:no3}]},
tn_ent${d:3,r:tano$[ta_ent${d:a_int,r:noa}]}]},
true)


sexpect("Closure, but not all there",
vimsota$close(vimsota$has_subpath_to(
vimsota$has_closed_path(s_create,2,a_int),
1,a_geta,2),
1,sa$[a_geta,a_getb]),
vsr${equivs:ernn$[],
closures:tnsa$[ts_ent${d:1,r:sa$[a_geta,a_getb]},
ts_ent${d:2,r:sa$[a_int]}],
transitions:
tntano$[tn_ent${d:1,r:tano$[ta_ent${d:a_geta,r:no2}]},
tn_ent${d:2,r:tano$[ta_ent${d:a_int,r:noa}]}]},
false)
% check for class error 
tmp:=vimsota$has_subpath_to(s_create,1,a_geta,2) 
begin 
tmp:=vimsota$has_subpath_to(tmp,1,a_array,2)
signal failure("class error 1 not caught")
end except when empty: stream$putl(stream$primary_output(),
"class error 1 caught ok")
end 


tmp:=vimsota$has_path(s_create,1,a_int)
begin
tmp:=vimsota$has_path(tmp,1,a_string)
signal failure("class error 2 not caught")
end except when empty: stream$putl(stream$primary_output(),
"class error 2 caught ok")
end 


% check for path with non-terminator error 
begin
tmp:=vimsota$has_path(s_create,1,a_array)
signal failure("has_path with non terminator not caught")
end except when non_terminator:
stream$putl(stream$primary_output(),
"has_path with non terminator caught ok")
end
% check for subpath_to with terminator error
begin
tmp:=vimsota$has_subpath_to(s_create,1,a_int,2)
signal failure("has_subpath_to with terminator not caught")
end except when terminator:
stream$putl(stream$primary_output(),
"has_subpath_to with terminator caught ok")
end


end sotatest 


% if the rep of the mysota is not equal to expected_rep then prints an
% error, otherwise prints "ok"
sexpect=proc(name:string, mysota:vimsota, expected_rep:vimsotarep, guta:bool)
own po:stream:=stream$primary_output()
died:bool:=false
exp:vimsotarep:=vimsota$export(mysota)
stream$puts(po,name)
if exp.equivs=expected_rep.equivs then
stream$puts(po," equivs ok,")
else
stream$puts(po," equivs broken,")
died:=true
end
if exp.closures=expected_rep.closures then
stream$puts(po," closures ok,")
else
stream$puts(po," closures broken,")
died:=true
end
% have to do the mapping test badly, sigh, this is because
% i am really modeling (nodename,alphabet)->nodename, but
% ended up using nodename->(alphabet->nodename)
trandied:bool:=false 
begin 
for tn:nodename, ta:tano in tntano$entries(exp.transitions) do
for ts:alphabet, tno:no in tano$entries(ta) do
etn:no:=expected_rep.transitions[tn][ts]
tagcase tno
tag acceptor: if etn~=tno then exit bad_map end
tag node(num:int):
if ~set[int]$ElementOf(no$value_node(etn),
exp.equivs[num])
then exit bad_map end
end
end
end
for tn:nodename, ta:tano in tntano$entries(expected_rep.transitions) do
for ts:alphabet, tno:no in tano$entries(ta) do
etn:no:=exp.transitions[tn][ts]
tagcase tno
tag acceptor: if etn~=tno then exit bad_map end
tag node(num:int):
if ~set[int]$ElementOf(no$value_node(etn),
expected_rep.equivs[num])
then exit bad_map end
end
end
end
end
except when undefined, bad_map,wrong_type: trandied:=true died:=true end


if trandied then stream$puts(po," transitions broken,")
else stream$puts(po," transitions ok,") end


begin
vimsota$get_unique_type_assignment(mysota)
if guta then stream$putl(po," guta defined ok")
else
stream$putl(po," expected guta ambiguity, it wasn't")
died:=true
end
end
except when ambiguous:
if guta then
stream$putl(po," but expect guta defined, it wasn't")
died:=true


else 
stream$putl(po," guta ambiguous ok") 
end 
end 
if died then signal failure("died---") end 
end sexpect 